Investigate automatic speech recognition and keyword search for very low-resource language

Please use this identifier to cite or link to this item: https://oar.a-star.edu.sg/communities-collections/articles/13754

Title:

Investigate automatic speech recognition and keyword search for very low-resource language

Journal Title:

Proceedings of IEEE 2nd International Conference on Signal and Image Processing (ICSIP)

Authors:

Bin Ma, Chongjia Ni

Publication Date:

06 August 2017

Abstract:

This paper investigates pronunciation lexicons, multi-lingual bottleneck features, semi-supervised learning, and data selection as ways to improve automatic speech recognition (ASR) and keyword search (KWS) under a very low-resource condition. In this condition, only about 3 hours of transcribed speech data are available, and there are no manual pronunciations for the words in the transcriptions. Our experiments on Swahili, the OpenKWS15 surprise language, lead to the following conclusions. (1) The pronunciation lexicon has a greater influence on KWS performance under the very limited language package (VLLP) condition than under the full language package (FLP) condition. (2) Multi-lingual bottleneck features (BNF) improve the performance of ASR and KWS, and combining them with semi-supervised learning improves performance further. (3) Training the language model (LM) on a large-scale text corpus greatly improves the performance of the KWS system and its underlying ASR system. Extending the vocabulary for keyword search reduces the number of out-of-vocabulary words in the keyword list and thus slightly improves KWS performance. (4) The selection of the initial transcription data is important for the performance of the KWS system and its underlying ASR system.
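Conclusion (3) links vocabulary size to out-of-vocabulary (OOV) keywords. The sketch below is only an illustration of that relationship, not the authors' method: the function name `keyword_oov_rate`, the rule that a keyword counts as OOV when any of its words is missing from the vocabulary, and the toy word lists are all assumptions made for this example.

```python
def keyword_oov_rate(keywords, vocabulary):
    """Return the fraction of keywords with at least one word missing from the vocabulary.

    Assumption: keywords are whitespace-separated phrases and matching is
    case-insensitive. A real KWS setup may define OOV differently.
    """
    vocab = {word.lower() for word in vocabulary}
    if not keywords:
        return 0.0
    oov_count = sum(
        1
        for keyword in keywords
        if any(token.lower() not in vocab for token in keyword.split())
    )
    return oov_count / len(keywords)


# Toy data invented for illustration; these are not figures from the paper.
keywords = ["habari", "asante sana", "karibu", "rafiki"]
small_vocab = ["habari", "asante"]
extended_vocab = small_vocab + ["sana", "karibu"]

rate_small = keyword_oov_rate(keywords, small_vocab)
rate_extended = keyword_oov_rate(keywords, extended_vocab)
print(f"OOV rate with small vocabulary:    {rate_small:.2f}")
print(f"OOV rate with extended vocabulary: {rate_extended:.2f}")
```

In this toy case the extended vocabulary covers more of the keyword list, so fewer keywords would need a separate OOV handling path. How much that changes KWS performance depends on the system and is not something this sketch measures.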

License type:

PublisherCopyrights

URI:

https://oar.a-star.edu.sg/communities-collections/articles/13754

Collections:

Institute for Infocomm Research

Manuscripts in This Item:

File	Size	Format
investigate-automatic-speech-recognition-and-keyword-search-for-very-low-resource-language-final.pdf	273.03 KB	PDF