Wong, J. H. M., & Chen, N. F. (2024, April 14). Distilling Distributional Uncertainty from a Gaussian Process. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). https://doi.org/10.1109/icassp48485.2024.10448172
Abstract:
A Neural Network (NN) may exhibit overconfidence about wrong hypotheses, especially for Out-Of-Domain (OOD) inputs. A Gaussian Process (GP), in contrast, has explainable distributional uncertainty behaviour: it predicts hypotheses with greater uncertainty for query inputs that lie further from the training data. Previous work has shown that an NN can learn to emulate the behaviour of a GP on in-domain data. This paper expands upon that work by proposing to train an NN student to emulate the GP teacher's distributional uncertainty behaviour on OOD data. This avoids the computational cost of using a GP at run-time, while improving the OOD confidence calibration of the NN. More accurate confidence calibration may better inform how the system should give feedback to the user. Experiments on the SEP-28k-E stutter detection dataset suggest that distillation of such knowledge between these models is feasible.
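The distillation recipe sketched in the abstract, a cheap NN student trained to reproduce a GP teacher's distance-aware predictive uncertainty, can be illustrated with a toy example. The sketch below is not the authors' implementation (the paper addresses stutter detection, a classification task); it assumes a 1-D regression problem, an RBF-kernel GP teacher from scikit-learn, and an MLP student that regresses the teacher's predictive mean and log-variance at both in-domain and synthetic OOD query points.

```python
# Minimal sketch of GP-to-NN uncertainty distillation (illustrative only;
# the task, models, and targets here are assumptions, not the paper's setup).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# In-domain training data: a noisy sine on [-3, 3].
X_train = rng.uniform(-3, 3, size=(60, 1))
y_train = np.sin(X_train).ravel() + 0.1 * rng.standard_normal(60)

# GP teacher: its predictive variance grows for queries far from X_train.
teacher = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                                   normalize_y=True)
teacher.fit(X_train, y_train)

# Distillation queries span in-domain AND out-of-domain regions, so the
# student observes the teacher's uncertainty growth away from the data.
X_query = np.linspace(-8, 8, 400).reshape(-1, 1)
mu, std = teacher.predict(X_query, return_std=True)

# Student: a single NN mapping each input to the teacher's predictive
# mean and log-variance (the log keeps the variance target unbounded).
targets = np.column_stack([mu, np.log(std ** 2)])
student = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000,
                       random_state=0)
student.fit(X_query, targets)

# At run-time only the cheap student is needed: one forward pass yields
# both a hypothesis and a distance-aware confidence estimate.
mu_s, logvar_s = student.predict(np.array([[0.0], [7.5]])).T
print("in-domain predictive std:", np.exp(0.5 * logvar_s[0]))
print("OOD predictive std:      ", np.exp(0.5 * logvar_s[1]))
```

Under these assumptions, the student's predicted standard deviation should come out noticeably larger at the OOD query (x = 7.5) than at the in-domain one (x = 0), mimicking the GP's distributional uncertainty behaviour while avoiding the cost of evaluating the GP, which depends on the training-set kernel matrix, at run-time.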
License type:
Publisher Copyright
Funding Info:
This research/project is supported by the NIE - SpeechEval Phase II: SHE4EDU (Speech Highlighter and Evaluation for Education).
Grant reference no.: EC-2023-061