Implementing Prosodic Phrasing in Chinese End-to-end Speech Synthesis

Please use this identifier to cite or link to this item: https://oar.a-star.edu.sg/communities-collections/articles/14417

Title:

Implementing Prosodic Phrasing in Chinese End-to-end Speech Synthesis

Journal Title:

2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

DOI:

10.1109/icassp.2019.8682368

Publication URL:

https://doi.org/10.1109/icassp.2019.8682368

Authors:

Yanfeng Lu, Minghui Dong, Ying Chen

Keywords:

Chinese speech synthesis, Tacotron 2, Wavenet vocoder, end-to-end TTS, prosodic phrasing

Publication Date:

12 May 2019

Citation:

Y. Lu, M. Dong and Y. Chen, "Implementing Prosodic Phrasing in Chinese End-to-end Speech Synthesis," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom, 2019, pp. 7050-7054. doi: 10.1109/ICASSP.2019.8682368

Abstract:

Text-to-Speech (TTS) systems have been evolving rapidly in recent years. With the great modelling power of deep neural networks, researchers have achieved end-to-end conversion from raw text to speech. It has been shown by various research projects that end-to-end TTS systems are able to generate speech that sounds akin to human voice for English and other languages. However, for languages like Chinese, there are two problems to deal with. Firstly, due to the large character set, a small input set comparable to the English character set is needed for the end-to-end solution. Secondly, there are serious prosodic phrasing mistakes when the end-to-end method is applied to Chinese. In this paper, we will propose a solution for an end-to-end Chinese TTS system on the basis of Tacotron 2 and Wavenet vocoder. We will then add extra contextual information to improve the performance of prosodic phrasing. Our experiments have demonstrated the effectiveness of this proposal.

License type:

Publisher Copyrights

Funding Info:

National Science Foundation of China, approval number 61573187

URI:

https://oar.a-star.edu.sg/communities-collections/articles/14417

ISSN:

2379-190X
1520-6149

Collections:

Institute for Infocomm Research

Manuscripts in This Item:

File	Size	Format
tacotronenhance.pdf	1.39 MB	PDF