SPEECH-MAMBA: LONG-CONTEXT SPEECH RECOGNITION WITH SELECTIVE STATE SPACES MODELS

Title:
SPEECH-MAMBA: LONG-CONTEXT SPEECH RECOGNITION WITH SELECTIVE STATE SPACES MODELS
Journal Title:
IEEE Spoken Language Technology Workshop 2024
DOI:
Publication Date:
02 December 2024
Citation:
Gao, Xiaoxue, and Nancy F. Chen. "Speech-mamba: Long-context speech recognition with selective state spaces models," in IEEE Spoken Language Technology Workshop, 2024, pp. 182–189.
Abstract:
Current automatic speech recognition systems struggle to model long speech sequences because of the quadratic complexity of Transformer-based models. Selective state space models such as Mamba have performed well on long-sequence modeling in natural language processing and computer vision tasks. However, their use in speech technology tasks remains under-explored. We propose Speech-Mamba, which incorporates selective state space modeling into Transformer neural architectures. In Speech-Mamba, long-sequence representations from selective state space models are complemented with lower-level representations from Transformer-based modeling. Speech-Mamba is better able to model long-range dependencies because it scales near-linearly with sequence length.
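The scaling contrast in the abstract can be illustrated with a minimal sketch. It is not the Speech-Mamba implementation: the function names, the scalar state, the softplus step size, and the parameters `w_delta` and `a` are all assumptions chosen for illustration. The toy recurrence makes one pass over the sequence, while self-attention compares every position with every other position.

```python
import math


def selective_scan(xs, w_delta=1.0, a=-1.0):
    """Toy 1-D selective state space recurrence (illustrative assumption only).

    h_t = exp(delta_t * a) * h_{t-1} + delta_t * x_t
    where delta_t = softplus(w_delta * x_t) depends on the input,
    which is what makes the recurrence "selective".
    """
    h = 0.0
    ys = []
    for x in xs:  # one state update per time step: cost grows linearly
        delta = math.log1p(math.exp(w_delta * x))  # softplus step size
        h = math.exp(delta * a) * h + delta * x
        ys.append(h)
    return ys


def attention_pair_count(length):
    """Number of query-key scores in full self-attention: length squared."""
    return length * length


def scan_step_count(length):
    """Number of state updates in the recurrence above: one per time step."""
    return length


if __name__ == "__main__":
    for length in (1000, 2000, 4000):
        print(length, scan_step_count(length), attention_pair_count(length))
```

Under these assumptions, doubling the sequence length doubles the number of scan steps but quadruples the number of attention scores, which is the gap that grows with long speech inputs.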
License type:
Publisher Copyright
Funding Info:
This research / project is supported by the Agency for Science, Technology and Research (A*STAR), and Institute for Infocomm Research (I2R) - SpeechEval Phase II: SHE4EDU (Speech Highlighter and Evaluation for Education)
Grant Reference No.: EC-2023-061
Description:
© 2024 IEEE.  Personal use of this material is permitted.  Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISBN:

Files uploaded:

File                     Size        Format
slt-speech-mamba-4.pdf   201.92 KB   PDF