Efficient Methods to Train Multilingual Bottleneck Feature Extractors for Low Resource Keyword Search

Other Titles:
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publication Date:
05 March 2017
Citation:
C. Ni, C. Leung, L. Wang, N. F. Chen and B. Ma, "Efficient methods to train multilingual bottleneck feature extractors for low resource keyword search," 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, 2017, pp. 5650-5654. doi: 10.1109/ICASSP.2017.7953238
Abstract:
Training a bottleneck feature (BNF) extractor on multilingual data is common practice in low-resource keyword search. In a low-resource application, the amount of transcribed target-language data is limited, while multilingual data are usually plentiful. In this paper, we investigate two methods for training efficient multilingual BNF extractors for low-resource keyword search. The first method uses the target-language data to update an existing BNF extractor; the second combines the target-language data with the multilingual data and re-trains the BNF extractor from scratch. In both methods, we propose using long short-term memory recurrent neural network based language identification to select utterances in the multilingual training data that are acoustically close to the target language. Experiments on Swahili in the OpenKWS15 evaluation demonstrate the effectiveness of the proposed methods: the first enables rapid system development, and both outperform baseline BNF extractors in accuracy.
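The language-identification-based data selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, assuming the LID model has already produced per-utterance language posteriors; the function name, posterior values, and the 0.5 threshold are illustrative assumptions, not the authors' implementation.

```python
def select_utterances(utterances, lid_posteriors, target_lang, threshold=0.5):
    """Keep multilingual utterances whose LID posterior for the target
    language is at least `threshold`, i.e. utterances judged acoustically
    close to the target language."""
    selected = []
    for utt_id in utterances:
        posterior = lid_posteriors[utt_id].get(target_lang, 0.0)
        if posterior >= threshold:
            selected.append(utt_id)
    return selected

# Toy usage: three multilingual utterances scored against Swahili.
posteriors = {
    "utt1": {"swahili": 0.82, "tamil": 0.18},
    "utt2": {"swahili": 0.10, "tamil": 0.90},
    "utt3": {"swahili": 0.55, "tamil": 0.45},
}
print(select_utterances(["utt1", "utt2", "utt3"], posteriors, "swahili"))
# -> ['utt1', 'utt3']
```

The selected subset would then be used either to update an existing BNF extractor or to re-train one from scratch together with the target-language data, per the two methods described above.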
License type:
PublisherCopyrights
Description:
(c) 2017 IEEE.
ISSN:
2379-190X
ISBN:
978-1-5090-4117-6
978-1-5090-4116-9
978-1-5090-4118-3