Lee, K. A., Vestman, V., & Kinnunen, T. (2021). ASVtorch toolkit: Speaker verification with deep neural networks. SoftwareX, 14, 100697. doi:10.1016/j.softx.2021.100697
The human voice differs substantially between individuals. This facilitates automatic speaker verification (ASV), the task of recognizing a person from their voice. ASV accuracy has increased considerably over the past decade owing to advances in machine learning, particularly deep learning methods. An unfortunate downside has been a marked increase in the complexity of ASV systems. To help non-experts kick-start reproducible ASV development, a state-of-the-art toolkit implementing various ASV pipelines and functionalities is required. To this end, we introduce a new open-source toolkit, ASVtorch, implemented in Python using the widely used PyTorch machine learning framework.
Creative Commons Attribution 4.0 International (CC BY 4.0)
This work was partially supported by the Academy of Finland (project #309629) and by the Doctoral Program in Science, Technology, and Computing (SCITECO) of the University of Eastern Finland (UEF). The authors at UEF were also supported by NVIDIA Corporation with the donation of a Titan V GPU.