Joint speaker diarisation and tracking in switching state-space model

Title:
Joint speaker diarisation and tracking in switching state-space model
Journal Title:
Spoken Language Technology Workshop
DOI:
Publication URL:
Keywords:
Publication Date:
09 January 2023
Citation:
NA
Abstract:
Speakers may move around while diarisation is being performed. When a microphone array is used, the instantaneous locations from which sounds originate can be estimated, and previous investigations have shown that such information can be complementary to speaker embeddings in the diarisation task. However, these approaches often assume that speakers are fairly stationary throughout a meeting. This paper relaxes this assumption by explicitly tracking the movements of speakers while jointly performing diarisation within a unified model. A state-space model is proposed, in which the hidden state expresses the identity of the currently active speaker and the predicted locations of all speakers. The model is implemented as a particle filter. Experiments on a Microsoft rich meeting transcription task show that the proposed joint location tracking and diarisation approach performs comparably with other methods that use location information.
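Illustrative sketch (not taken from the paper): the abstract describes a hidden state that combines the active speaker's identity with the locations of all speakers, inferred with a particle filter. The toy Python example below shows one way such a filter could be set up. Everything specific in it is an assumption made for illustration: locations are reduced to a single direction in degrees, speaker movement is a Gaussian random walk, the observation is a noisy direction of the active speaker, speaker embeddings are omitted entirely, angle wrap-around is ignored, and the function names and parameter values (`particle_filter_step`, `run_demo`, `p_switch`, `sigma_move`, `sigma_obs`) are hypothetical.

```python
import numpy as np


def particle_filter_step(speakers, locations, obs, rng,
                         p_switch=0.1, sigma_move=2.0, sigma_obs=5.0):
    """One predict/update/resample step for a toy switching state-space model.

    speakers:  (P,) int array, active-speaker index carried by each particle
    locations: (P, K) float array, direction (degrees) of each of K speakers
    obs:       scalar, observed sound direction for the current frame
    """
    n_particles, n_speakers = locations.shape

    # Predict the discrete state: keep the active speaker, or with
    # probability p_switch redraw it uniformly (possibly the same speaker).
    redraw = rng.random(n_particles) < p_switch
    candidates = rng.integers(0, n_speakers, size=n_particles)
    speakers = np.where(redraw, candidates, speakers)

    # Predict the continuous state: every speaker follows a random walk.
    locations = locations + rng.normal(0.0, sigma_move, size=locations.shape)

    # Update: Gaussian likelihood of the observation given the location of
    # the speaker that this particle believes is active.
    predicted = locations[np.arange(n_particles), speakers]
    log_w = -0.5 * ((obs - predicted) / sigma_obs) ** 2
    weights = np.exp(log_w - log_w.max())
    weights /= weights.sum()

    # Resample (multinomial) so that all particles carry equal weight again.
    keep = rng.choice(n_particles, size=n_particles, p=weights)
    return speakers[keep], locations[keep]


def run_demo(seed=0, n_particles=500, n_frames=100):
    """Track two synthetic speakers; return (estimated, true) active speaker."""
    rng = np.random.default_rng(seed)

    # Synthetic ground truth: speaker 0 talks first, then speaker 1,
    # and both speakers drift slowly while the recording runs.
    truth = (np.arange(n_frames) >= n_frames // 2).astype(int)
    start = np.array([30.0, 120.0])
    true_dir = start[truth] + 0.2 * np.arange(n_frames)
    observations = true_dir + rng.normal(0.0, 5.0, size=n_frames)

    # Assumption: approximate starting directions are known in advance.
    speakers = rng.integers(0, 2, size=n_particles)
    locations = start + rng.normal(0.0, 3.0, size=(n_particles, 2))

    estimate = np.empty(n_frames, dtype=int)
    for t, obs in enumerate(observations):
        speakers, locations = particle_filter_step(speakers, locations, obs, rng)
        # Diarisation output: the active speaker held by most particles.
        estimate[t] = np.bincount(speakers, minlength=2).argmax()
    return estimate, truth
```

Calling `run_demo()` returns the per-frame speaker estimate alongside the synthetic ground truth, so the two arrays can be compared directly. Resampling at every frame keeps the sketch short; a practical filter would more likely resample only when the effective sample size drops, and would fold an embedding likelihood into the weight update.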
License type:
Publisher Copyright
Funding Info:
There was no specific funding for the research done.
Description:
© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISSN:
NA
Files uploaded:

File                 Size       Format
0000605-amended.pdf  989.88 KB  PDF