Speech Foundation Model Ensembles for the Controlled Singing Voice Deepfake Detection (CTRSVDD) Challenge 2024

Please use this identifier to cite or link to this item: https://oar.a-star.edu.sg/communities-collections/articles/21473

Title:

Speech Foundation Model Ensembles for the Controlled Singing Voice Deepfake Detection (CTRSVDD) Challenge 2024

Journal Title:

2024 IEEE Spoken Language Technology Workshop (SLT)

DOI:

10.1109/SLT61566.2024.10832226

Publication URL:

https://doi.org/10.1109/slt61566.2024.10832226

Authors:

Anmol Guragain, Tianchi Liu, Zihan Pan, Hardik B. Sailor, Qiongqiong Wang

Keywords:

Publication Date:

16 January 2025

Citation:

Guragain, A., Liu, T., Pan, Z., Sailor, H. B., & Wang, Q. (2024). Speech Foundation Model Ensembles for the Controlled Singing Voice Deepfake Detection (CTRSVDD) Challenge 2024. 2024 IEEE Spoken Language Technology Workshop (SLT), 774–781. https://doi.org/10.1109/slt61566.2024.10832226

Abstract:

This work details our approach to achieving a leading system with a 1.79% pooled equal error rate (EER) on the evaluation set of the Controlled Singing Voice Deepfake Detection (CtrSVDD) track. The rapid advancement of generative AI models presents significant challenges for detecting AI-generated deepfake singing voices, a problem that is attracting increased research attention. The Singing Voice Deepfake Detection (SVDD) Challenge 2024 aims to address this complex task. In this work, we explore ensemble methods, utilizing speech foundation models to develop robust singing voice anti-spoofing systems. We also introduce a novel Squeeze-and-Excitation Aggregation (SEA) method, which efficiently and effectively integrates representation features from the speech foundation models, surpassing the performance of our other individual systems. Evaluation results confirm the efficacy of our approach in detecting deepfake singing voices.
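
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how an equal error rate can be computed from detector scores. It is an illustration only, not the challenge's official scoring script: the function name `equal_error_rate`, the convention that a higher score means "more likely bona fide", and the toy scores are all assumptions. A "pooled" EER is taken here to mean that scores from all attack types are merged into one spoof set before the computation.

```python
import numpy as np


def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold). Assumes higher score = more likely bona fide."""
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every distinct score observed in either class.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))

    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoof trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER is read off where the two error rates are closest.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]


if __name__ == "__main__":
    # Toy scores (hypothetical): spoof trials from all attacks pooled together.
    bonafide = [0.9, 0.8, 0.7, 0.4]
    pooled_spoof = [0.6, 0.3, 0.2, 0.1]
    eer, thr = equal_error_rate(bonafide, pooled_spoof)
    print(f"EER = {eer:.2%} at threshold {thr}")
```

On this toy data one bona fide trial and one spoof trial are misclassified at the chosen threshold, so both error rates equal one in four. A reported figure such as 1.79% means the two error rates coincide at that much lower value.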

License type:

Publisher Copyright

Funding Info:

This research/project is supported by the National Research Foundation, Prime Minister’s Office, Singapore, and the Ministry of Digital Development and Information - Online Trust and Safety (OTS) Research Programme.
Grant Reference No.: MCI-OTS-001

Description:

© 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

URI:

https://oar.a-star.edu.sg/communities-collections/articles/21473

ISBN:

979-8-3503-9226-5

Collections:

Institute for Infocomm Research

Files uploaded:

Manuscripts in This Item:

File	Size	Format
sltxsvdd2024-7.pdf	757.29 KB	PDF