On the Study of Generative Adversarial Networks for Cross-Lingual Voice Conversion

Title:
On the Study of Generative Adversarial Networks for Cross-Lingual Voice Conversion
Journal Title:
2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Publication Date:
20 February 2020
Citation:
B. Sisman, M. Zhang, M. Dong and H. Li, "On the Study of Generative Adversarial Networks for Cross-Lingual Voice Conversion," 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), SG, Singapore, 2019, pp. 144-151, doi: 10.1109/ASRU46091.2019.9003939.
Abstract:
Cross-lingual voice conversion (VC) aims to convert the source speaker's voice to sound like that of the target speaker when the source and target speakers speak different languages. In this paper, we propose to use Generative Adversarial Networks (GANs) for cross-lingual voice conversion. We further the studies on Variational Autoencoding Wasserstein GAN (VAW-GAN) and cycle-consistent adversarial network (CycleGAN), which are known to be effective for mono-lingual voice conversion. As cross-lingual voice conversion needs to convert the voice across different phonetic systems, it is more challenging than mono-lingual voice conversion. By using VAW-GAN and CycleGAN, we successfully convert the speaker identity while carrying over the source speaker's linguistic content. The proposed idea is unique in the sense that it relies neither on bilingual data and their alignment nor on any external process, such as ASR. Moreover, it works with a limited amount of training data of any two languages. To the best of our knowledge, this is the first comprehensive study of Generative Adversarial Networks in cross-lingual voice conversion. In the experiments, we achieve high-quality converted voice that performs as well as or better than mono-lingual voice conversion.
License type:
Publisher Copyright
Funding Info:
This research is supported by a Programmatic grant from the Singapore Government's Research, Innovation and Enterprise 2020 plan (Advanced Manufacturing and Engineering domain).

This research / project is supported by the National Research Foundation, Singapore - AI Singapore Programme
Grant Reference No.: AISG-100E-2018-006
Description:
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISBN:
978-1-7281-0306-8
978-1-7281-0305-1
978-1-7281-0307-5
Files uploaded:
asru2019.pdf (PDF, 377.64 KB)