Duan, R. (2023). Joint Learning Feature and Model Adaptation for Unsupervised Acoustic Modelling of Child Speech. INTERSPEECH 2023. https://doi.org/10.21437/interspeech.2023-1302
Abstract:
Due to the high acoustic variability of child speech and the lack of publicly available datasets, acoustic modeling for child speech is challenging. In this work, we address these challenges by leveraging the large amounts of resources for adult speech (well-trained acoustic models and transcribed speech datasets) and proposing a joint acoustic feature and model adaptation framework to minimize acoustic mismatch between adult and child speech. Empirical results on three tasks of speech recognition, pronunciation assessment, and fluency assessment show that our proposed approach consistently outperforms competitive baselines, achieving up to 31.18% phone error reduction on speech recognition and around 7% gains on speech evaluation tasks.
License type:
Publisher Copyright
Funding Info:
This research is supported by core funding from: I2R
Grant Reference no.: SC20-RD120