J. H. M. Wong, H. Zhang, and N. F. Chen, "Bounded Gaussian process with multiple outputs and ensemble combination," in Proc. IEEE CAI, Jun. 2024, Singapore, pp. 280-285.
Abstract:
Spoken Language Assessment (SLA) is a subjective task in which different human raters often assign different scores to the same input, and the score range is often bounded. Prior work applying a Gaussian Process (GP) to SLA uses a Gaussian output, which is unbounded, and does not consider inter-rater uncertainty. This paper investigates using a bounded beta density function as the output of a GP in SLA, and proposes extending this bounded GP framework to utilise the multiple output samples per input in the training set. In the experiments, various types of Neural Network (NN) and GP models are trained, and ensemble combinations of these GPs and NNs are investigated. Experiments on the speechocean762 dataset show that a beta output predicts the inter-rater uncertainty better than a Gaussian output. Using multiple output samples in the training set further improves the beta-output GP's inter-rater uncertainty prediction. Combining a GP with an NN yields improvements.
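As a rough illustration of the idea in the abstract, the sketch below scores several rater labels for one input under a beta density after rescaling them into the unit interval. It is not the paper's implementation: no GP is modelled, the shape parameters `alpha` and `beta` are simply given (however a model might produce them), and the function names, the 0 to 10 score range, and the clipping margin `eps` are all assumptions made for the example.

```python
import math


def rescale(score, low=0.0, high=10.0, eps=1e-3):
    """Map a score in [low, high] into the open interval (0, 1).

    The score range and the clipping margin eps are assumed values;
    clipping keeps the beta log-density finite at the boundaries.
    """
    unit = (score - low) / (high - low)
    return min(max(unit, eps), 1.0 - eps)


def beta_log_pdf(y, alpha, beta):
    """Log density of Beta(alpha, beta) at y, with 0 < y < 1."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return log_norm + (alpha - 1.0) * math.log(y) + (beta - 1.0) * math.log(1.0 - y)


def multi_rater_nll(rater_scores, alpha, beta, low=0.0, high=10.0):
    """Average negative log-likelihood of all rater scores for one input.

    Every rater's score is treated as a separate sample from the same
    bounded output density, rather than collapsing them to one mean score.
    """
    ys = [rescale(s, low, high) for s in rater_scores]
    return -sum(beta_log_pdf(y, alpha, beta) for y in ys) / len(ys)


def beta_mean_var(alpha, beta):
    """Mean and variance of Beta(alpha, beta) on the unit interval."""
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1.0))
    return mean, var


if __name__ == "__main__":
    scores = [7, 8, 8, 9, 8]  # hypothetical scores from five raters
    print(multi_rater_nll(scores, alpha=8.0, beta=2.0))
    print(beta_mean_var(8.0, 2.0))
```

The variance returned by `beta_mean_var` is one possible stand-in for predicted inter-rater spread; unlike a Gaussian output, the density assigns no mass outside the score range.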
License type:
Publisher Copyright
Funding Info:
This research/project is supported by the NIE - SpeechEval Phase II: SHE4EDU (Speech Highlighter and Evaluation for Education).
Grant Reference no.: EC-2023-061