Yang, J., Das, R. K., & Zhou, N. (2019). Extraction of Octave Spectra Information for Spoofing Attack Detection. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(12), 2373–2384. doi:10.1109/taslp.2019.2946897
Abstract:
This article focuses on extracting information from the octave power spectra of the long-term constant-Q transform (CQT) for spoofing attack detection. A novel framework based on a multi-level transform (MLT) is proposed that captures the relevant information from the octave power spectra level by level. We then derive a novel feature, referred to as the constant-Q multi-level coefficient (CMC), based on the proposed MLT. The proposed feature is evaluated on synthetic and replay speech detection using the ASVspoof 2015 and ASVspoof 2017 version 2.0 databases, respectively. We find that the proposed CMC feature outperforms the conventional long-term feature based on constant-Q cepstral coefficients, which is obtained from the linear power spectrum after uniform resampling. This demonstrates the usefulness of the MLT for extracting salient artifacts from the octave power spectrum. Further, the proposed CMC feature performs better than other well-known state-of-the-art systems for spoofing attack detection, which underlines its importance.
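As a rough illustration of what "octave power spectra" refers to, the sketch below groups the geometrically spaced bins of a CQT power spectrum into octaves. It is not the paper's MLT or CMC extraction, which the abstract does not specify. The function names and the values for minimum frequency, bins per octave, and number of octaves are hypothetical and chosen only for the example.

```python
import numpy as np


def cqt_center_frequencies(f_min, bins_per_octave, n_octaves):
    """Center frequencies of a CQT: f_k = f_min * 2**(k / bins_per_octave)."""
    k = np.arange(bins_per_octave * n_octaves)
    return f_min * 2.0 ** (k / bins_per_octave)


def split_into_octaves(power_spectrum, bins_per_octave):
    """Reshape a CQT power spectrum (bins first) into (octave, bin-in-octave, ...)."""
    n_bins = power_spectrum.shape[0]
    if n_bins % bins_per_octave != 0:
        raise ValueError("number of bins must be a multiple of bins_per_octave")
    new_shape = (n_bins // bins_per_octave, bins_per_octave) + power_spectrum.shape[1:]
    return power_spectrum.reshape(new_shape)


# Hypothetical parameters, not taken from the paper.
F_MIN, BINS_PER_OCTAVE, N_OCTAVES = 15.0, 96, 9

freqs = cqt_center_frequencies(F_MIN, BINS_PER_OCTAVE, N_OCTAVES)

# Stand-in for a long-term CQT power spectrum (one value per frequency bin).
rng = np.random.default_rng(0)
spectrum = rng.random(BINS_PER_OCTAVE * N_OCTAVES)

octaves = split_into_octaves(spectrum, BINS_PER_OCTAVE)
print(octaves.shape)  # one row per octave
```

Because the center frequency doubles every `bins_per_octave` bins, each row of `octaves` covers exactly one octave with the same number of bins, which is why the octave-domain spectrum differs from a linearly resampled one.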
License type:
Publisher Copyright
Funding Info:
This research is funded by the National Natural Science Foundation of China under Grants 6177120, 61571192, and 61301300.