Wong, J. H. M., Zhang, H., & Chen, N. (2022). Variations of multi-task learning for spoken language assessment. Interspeech 2022. https://doi.org/10.21437/interspeech.2022-28
Automatic spoken language assessment often operates in a regime where only a limited quantity of training data is available. In other low-resourced tasks, such as speech recognition, multi-task learning has previously been investigated as an approach to regularise the model and maximise the utilisation of the available annotation information during training. This paper applies multi-task learning to spoken language assessment by assessing three forms of task diversity: concurrently learning scores at different linguistic levels, learning different types of scores, and learning different representations of the same score. Experiments on the speechocean762 dataset suggest that jointly learning from phone- and word-level scores yields significant performance gains for the sentence-level score prediction task, and that jointly learning from different score types can also be mutually beneficial.
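The first form of task diversity (scores at different linguistic levels) can be illustrated with a minimal sketch: a shared encoder feeds separate task-specific heads, and the training objective is a weighted sum of per-task losses. All dimensions, weight values, and the use of linear heads with an MSE objective are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: acoustic feature and hidden dimensions are
# illustrative, not taken from the paper.
FEAT_DIM, HID_DIM = 16, 8

# Shared encoder weights plus one linear head per linguistic level
# (phone, word, sentence), mirroring the "different linguistic levels"
# form of task diversity described in the abstract.
W_shared = rng.normal(size=(FEAT_DIM, HID_DIM))
heads = {level: rng.normal(size=(HID_DIM, 1))
         for level in ("phone", "word", "sentence")}

def forward(x):
    """Compute a shared representation, then one score per task head."""
    h = np.tanh(x @ W_shared)
    return {level: (h @ w).squeeze(-1) for level, w in heads.items()}

def multitask_loss(preds, targets, weights):
    """Joint objective: weighted sum of per-task MSE losses."""
    return sum(weights[l] * np.mean((preds[l] - targets[l]) ** 2)
               for l in preds)

# A toy batch of 4 utterances with scores on an assumed 0-10 scale.
x = rng.normal(size=(4, FEAT_DIM))
targets = {l: rng.uniform(0, 10, size=4) for l in heads}
loss = multitask_loss(forward(x), targets,
                      {"phone": 1.0, "word": 1.0, "sentence": 1.0})
```

Training the shared encoder against this joint loss is what lets the lower-level (phone and word) annotations regularise the sentence-level score predictor.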
This research/project is supported by the A*STAR Speech Evaluation for English Reading Aloud grant (Grant Reference No. EC-2020-011).