Probabilistic Back-ends for Online Speaker Recognition and Clustering

Title:
Probabilistic Back-ends for Online Speaker Recognition and Clustering
Journal Title:
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publication Date:
05 May 2023
Citation:
Sholokhov, A., Kuzmin, N., Lee, K. A., & Chng, E. S. (2023, June 4). Probabilistic Back-ends for Online Speaker Recognition and Clustering. ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). https://doi.org/10.1109/icassp49357.2023.10097032
Abstract:
This paper focuses on multi-enrollment speaker recognition, which naturally occurs in the task of online speaker clustering, and studies the properties of different scoring back-ends in this scenario. First, we show that the popular cosine scoring suffers from poor score calibration when the number of enrollment utterances varies. Second, we propose a simple replacement for cosine scoring based on an extremely constrained version of probabilistic linear discriminant analysis (PLDA). The proposed model improves over cosine scoring for multi-enrollment recognition while keeping the same performance in the case of one-to-one comparisons. Finally, we consider an online speaker clustering task where each step naturally involves multi-enrollment recognition. We propose an online clustering algorithm that exploits the benefits of the PLDA model, such as its ability to handle uncertainty and its better score calibration. Our experiments demonstrate the effectiveness of the proposed algorithm.
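As a rough illustration of the two scoring styles the abstract contrasts, the sketch below compares cosine scoring against an averaged enrollment embedding with a log-likelihood ratio from a two-covariance Gaussian model whose covariances are spherical. Everything specific here is an assumption for illustration and is not taken from the paper: averaging the enrollment embeddings for cosine scoring, the spherical form of the constraint, the function names `cosine_score` and `isotropic_llr`, and the variance parameters `b` and `w`.

```python
import numpy as np


def cosine_score(enroll, test):
    """Cosine similarity between the mean enrollment embedding and the test embedding.

    Averaging the enrollment embeddings is an assumed convention, not the paper's.
    """
    centroid = enroll.mean(axis=0)
    return float(centroid @ test / (np.linalg.norm(centroid) * np.linalg.norm(test)))


def isotropic_llr(enroll, test, b=1.0, w=1.0):
    """Log-likelihood ratio under an assumed model x = s + e,
    with s ~ N(0, b*I) (speaker) and e ~ N(0, w*I) (within-speaker noise).

    b and w are hypothetical between- and within-speaker variances.
    """
    n, d = enroll.shape
    # Posterior over the speaker variable given n enrollment embeddings.
    post_mean = (n * b / (n * b + w)) * enroll.mean(axis=0)
    post_var = b * w / (n * b + w)
    # Predictive variance of a test embedding under each hypothesis.
    same_var = post_var + w   # same speaker: shrinks as n grows
    diff_var = b + w          # different speaker: prior predictive
    log_same = -0.5 * (d * np.log(2 * np.pi * same_var)
                       + np.sum((test - post_mean) ** 2) / same_var)
    log_diff = -0.5 * (d * np.log(2 * np.pi * diff_var)
                       + np.sum(test ** 2) / diff_var)
    return float(log_same - log_diff)


# Toy usage: two identical enrollment embeddings, one matching and one opposite test.
enroll = np.array([[1.0, 0.0], [1.0, 0.0]])
matching = np.array([1.0, 0.0])
opposite = np.array([-1.0, 0.0])
print(cosine_score(enroll, matching), isotropic_llr(enroll, matching))
print(cosine_score(enroll, opposite), isotropic_llr(enroll, opposite))
```

In this toy model the number of enrollment embeddings `n` enters the likelihood ratio explicitly through the posterior mean and variance, whereas the cosine score of an averaged embedding has no such term. That is one plausible reading of why a probabilistic back-end could be better calibrated when `n` varies.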
License type:
Publisher Copyright
Funding Info:
There was no specific funding for this research.
Description:
© 2023 IEEE. Published in ICASSP 2023 – 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), scheduled for 4-9 June 2023 in Rhodes Island, Greece. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE.
ISSN:
2379-190X
Files uploaded:
25746-final-paper.pdf (331.04 KB, PDF)