Joint Chinese word segmentation and punctuation prediction using deep recurrent neural network for social media data

Title:
Joint Chinese word segmentation and punctuation prediction using deep recurrent neural network for social media data
Other Titles:
2015 International Conference on Asian Language Processing (IALP)
DOI:
10.1109/IALP.2015.7451527
Publication Date:
24 October 2015
Citation:
Kui Wu, Xuancong Wang, Nina Zhou, AiTi Aw and Haizhou Li, "Joint Chinese word segmentation and punctuation prediction using deep recurrent neural network for social media data," 2015 International Conference on Asian Language Processing (IALP), Suzhou, 2015, pp. 41-44. doi: 10.1109/IALP.2015.7451527
Abstract:
In this work, we propose to jointly perform Chinese word segmentation (CWS) and punctuation prediction (PU) in a unified framework using a deep recurrent neural network (DRNN). On a social media corpus, we further compare the joint frameworks with isolated prediction and with pipeline methods that link the two tasks sequentially. Our experimental results show that the joint models improve CWS performance while affecting PU only marginally. We also study the effects of CWS and PU on Chinese-to-English machine translation (MT) quality by evaluating on a parallel social media corpus. The joint models are shown to be superior to both the isolated prediction and the pipeline approaches.
License type:
PublisherCopyrights
Description:
(c) 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
ISBN:
978-1-4673-9595-3