You, C. H., & Dong, M. (2024, April 14). A Study on Combining Non-Parallel and Parallel Methodologies for Mandarin-English Cross-Lingual Voice Conversion. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). https://doi.org/10.1109/icassp48485.2024.10446264
Abstract:
In this paper, we propose a cross-lingual voice conversion (VC) scheme leveraging non-parallel and parallel methodologies. The goal of the VC is to transform the voice of one speaker from a language dataset into the voice of another speaker from a different language dataset. First, two non-parallel methods are investigated separately: CycleGAN-VC2 and phonetic posteriorgram (PPG) VC. Second, two different parallel VC systems are developed to enhance the quality of the converted speech spectrogram, where the output speech from the non-parallel VC is paired with the corresponding original speech to form the parallel pair.
Focusing on Mandarin-English bilingual databases, the proposed VC scheme improves speech naturalness and speaker similarity as compared to the baseline non-parallel methods.
License type:
Publisher Copyright
Funding Info:
No specific funding was received for this research.