Transfer learning for Children's speech recognition

Please use this identifier to cite or link to this item: https://oar.a-star.edu.sg/communities-collections/articles/14477

Title:

Transfer learning for Children's speech recognition

Journal Title:

2017 International Conference on Asian Language Processing (IALP)

DOI:

10.1109/IALP.2017.8300540

Publication URL:

https://doi.org/10.1109/IALP.2017.8300540

Authors:

Rong Tong, Lei Wang, Bin Ma

Keywords:

automatic speech recognition, acoustic model, Transfer Learning, Multi-task learning, children’s speech processing

Publication Date:

05 December 2017

Citation:

R. Tong, L. Wang and B. Ma, "Transfer learning for children's speech recognition," 2017 International Conference on Asian Language Processing (IALP), Singapore, 2017, pp. 36-39. doi: 10.1109/IALP.2017.8300540

Abstract:

Children’s speech processing is more challenging than that of adults because large-scale children’s speech corpora are lacking. As the physical speech organs are still developing, children’s speech shows higher inter-speaker and intra-speaker variability. In addition, collecting data from children is difficult because they have limited language proficiency and usually a shorter attention span. In this paper, we aim to improve children’s automatic speech recognition (ASR) performance with transfer learning. We compare two transfer learning approaches for enhancing children’s ASR with adult data. The first obtains a children’s acoustic model by performing acoustic adaptation on a pre-trained adult model. The second is multi-task learning: the adult and children’s acoustic characteristics are learnt jointly in the shared hidden layers, while the output layers are optimized with different targets. Our experimental results show that both approaches are effective in transferring rich phonetic and acoustic information from adults to children, and that the multi-task learning approach outperforms the acoustic adaptation approach. We further show that the transfer learning technique is also effective in transferring speakers’ acoustic characteristics from other languages.
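
The multi-task arrangement described in the abstract (hidden layers shared between adult and children's data, with a separate output layer per target set) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the class name MultiTaskAcousticModel, the layer sizes, the ReLU activation, and the random initialisation are all assumptions made for illustration, and training is omitted.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x):
    """Apply an affine transform: one output per weight row."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


class MultiTaskAcousticModel:
    """Hypothetical model: shared hidden layers, one output head per task."""

    def __init__(self, n_features, n_hidden, n_adult_targets, n_child_targets, seed=0):
        rng = random.Random(seed)
        # Hidden layers used by BOTH adult and children's frames.
        self.shared = [
            make_layer(n_features, n_hidden, rng),
            make_layer(n_hidden, n_hidden, rng),
        ]
        # Task-specific output layers, each with its own target set.
        self.heads = {
            "adult": make_layer(n_hidden, n_adult_targets, rng),
            "child": make_layer(n_hidden, n_child_targets, rng),
        }

    def forward(self, frame, task):
        h = frame
        for layer in self.shared:
            h = relu(dense(layer, h))
        return softmax(dense(self.heads[task], h))


# Illustrative sizes only (assumed, not taken from the paper).
model = MultiTaskAcousticModel(n_features=40, n_hidden=16,
                               n_adult_targets=10, n_child_targets=8)
frame = [0.1] * 40
adult_post = model.forward(frame, "adult")
child_post = model.forward(frame, "child")
```

Both calls pass through the same shared layers and differ only in the output head, so any update to the shared weights would affect both tasks. This is the property that lets adult data inform the children's model in a multi-task setup.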

License type:

PublisherCopyrights

URI:

https://oar.a-star.edu.sg/communities-collections/articles/14477

ISBN:

978-1-5386-1981-0
978-1-5386-1980-3
978-1-5386-1982-7

Collections:

Institute for Infocomm Research
