On the Study of Generative Adversarial Networks for Cross-Lingual Voice Conversion

Title:
On the Study of Generative Adversarial Networks for Cross-Lingual Voice Conversion
Other Titles:
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Publication Date:
20 February 2020
Citation:
B. Sisman, M. Zhang, M. Dong and H. Li, "On the Study of Generative Adversarial Networks for Cross-Lingual Voice Conversion," 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), SG, Singapore, 2019, pp. 144-151, doi: 10.1109/ASRU46091.2019.9003939.
Abstract:
Cross-lingual voice conversion (VC) aims to convert the source speaker's voice to sound like that of the target speaker when the source and target speakers speak different languages. In this paper, we propose to use Generative Adversarial Networks (GANs) for cross-lingual voice conversion. We further the studies on Variational Autoencoding Wasserstein GAN (VAW-GAN) and cycle-consistent adversarial network (CycleGAN), which are known to be effective for mono-lingual voice conversion. As cross-lingual voice conversion needs to convert the voice across different phonetic systems, it is more challenging than mono-lingual voice conversion. By using VAW-GAN and CycleGAN, we successfully convert the speaker identity while carrying over the source speaker's linguistic content. The proposed idea is unique in the sense that it relies neither on bilingual data and their alignment nor on any external process, such as ASR. Moreover, it works with a limited amount of training data of any two languages. To the best of our knowledge, this is the first comprehensive study of Generative Adversarial Networks in cross-lingual voice conversion. In the experiments, we achieve high-quality converted voice that performs as well as or better than mono-lingual voice conversion.
License type:
PublisherCopyrights
Funding Info:
This research is supported by Programmatic grant no. A18A2b0046 from the Singapore Government's Research, Innovation and Enterprise 2020 plan (Advanced Manufacturing and Engineering domain) and by the National Research Foundation Singapore under its AI Singapore Programme (Award Number: AISG-100E-2018-006).
Description:
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISBN:
978-1-7281-0306-8
978-1-7281-0305-1
978-1-7281-0307-5