Diarisation using location tracking with agglomerative clustering

Title:
Diarisation using location tracking with agglomerative clustering
Journal Title:
Spoken Language Technology Workshop
Publication Date:
09 January 2023
Citation:
NA
Abstract:
Previous work has shown that spatial location information can complement speaker embeddings in speaker diarisation. However, the models used often assume that speakers remain fairly stationary throughout a meeting. This paper relaxes this assumption by explicitly modelling the movements of speakers within an Agglomerative Hierarchical Clustering (AHC) diarisation framework. Kalman filters, which track the locations of speakers, are used to compute log-likelihood ratios that contribute to the cluster affinity computations for the AHC merging and stopping decisions. Experiments on a Microsoft rich meeting transcription task show that the proposed approach yields improvements over methods that do not use location information or that assume stationarity.
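The idea of a Kalman-filter log-likelihood ratio feeding a cluster affinity can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a scalar location (for example a direction-of-arrival angle), a random-walk motion model, and hand-picked noise variances. The names `kalman_loglik`, `merge_llr`, `combined_affinity`, and the parameters `q`, `r`, and `weight` are hypothetical and chosen only for illustration.

```python
import math


def kalman_loglik(obs, q=0.05, r=0.1, prior_mean=0.0, prior_var=10.0):
    """Log-likelihood of (time, location) pairs under a random-walk Kalman filter.

    Assumptions (not from the paper): scalar location, process noise
    variance q per unit time, observation noise variance r, broad prior.
    """
    mean, var = prior_mean, prior_var
    prev_t = None
    total = 0.0
    for t, z in sorted(obs):
        if prev_t is not None:
            var += q * (t - prev_t)  # predict: uncertainty grows with elapsed time
        s = var + r                  # innovation variance
        innov = z - mean             # prediction error for this observation
        total += -0.5 * (math.log(2.0 * math.pi * s) + innov * innov / s)
        gain = var / s               # Kalman gain
        mean += gain * innov         # update the location estimate
        var *= 1.0 - gain            # update its variance
        prev_t = t
    return total


def merge_llr(cluster_a, cluster_b, **kw):
    """Log-likelihood ratio: one shared track versus two independent tracks."""
    joint = kalman_loglik(cluster_a + cluster_b, **kw)
    return joint - kalman_loglik(cluster_a, **kw) - kalman_loglik(cluster_b, **kw)


def combined_affinity(embedding_score, llr, weight=0.1):
    """Hypothetical fusion: add a scaled location LLR to an embedding score."""
    return embedding_score + weight * llr


# A speaker drifting slowly, a continuation of that drift, and a distant source.
a = [(0, 1.0), (1, 1.1), (2, 1.2)]
b = [(3, 1.3), (4, 1.4)]
c = [(3, -2.0), (4, -2.1)]
llr_same = merge_llr(a, b)
llr_diff = merge_llr(a, c)
```

In this toy setup, `llr_same` is positive because a single moving track explains both clusters well, while `llr_diff` is negative because merging would require an implausible jump in location. An AHC loop could use such a score alongside the embedding affinity when choosing which pair to merge and when to stop, although the exact fusion used in the paper is not stated in the abstract.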
License type:
Publisher Copyright
Funding Info:
There was no specific funding for this research.
Description:
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISSN:
NA
Files uploaded:

File Size Format
0000613-amend.pdf 688.53 KB PDF