Combining Punctuation and Disfluency Prediction: An Empirical Study

Please use this identifier to cite or link to this item: https://oar.a-star.edu.sg/communities-collections/articles/12852

Title:

Combining Punctuation and Disfluency Prediction: An Empirical Study

Journal Title:

Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing

DOI:

Publication URL:

http://aclweb.org/anthology/D/D14/D14-1013.pdf

Authors:

Xuancong Wang, Hwee Tou Ng, Khe Chai Sim

Keywords:

Publication Date:

25 October 2014

Citation:

Abstract:

Punctuation prediction and disfluency prediction can improve downstream natural language processing tasks such as machine translation and information extraction. Combining the two tasks can potentially improve the efficiency of the overall pipeline system and reduce error propagation. In this work, we compare various methods for combining punctuation prediction (PU) and disfluency prediction (DF) on the Switchboard corpus. We compare an isolated prediction approach with a cascade approach, a rescoring approach, and three joint model approaches. For the cascade approach, we show that the soft cascade method is better than the hard cascade method. We also use the cascade models to generate an n-best list, use the bi-directional cascade models to perform rescoring, and compare that with the results of the cascade models. For the joint model approach, we compare mixed-label Linear-chain Conditional Random Field (LCRF), cross-product LCRF and 2-layer Factorial Conditional Random Field (FCRF) with soft-cascade LCRF. Our results show that the various methods linking the two tasks are not significantly different from one another, although they perform better than the isolated prediction method by 0.5–1.5% in the F1 score. Moreover, the clique order of features also shows a marked difference.
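
As a rough illustration of how two per-token label sequences might be merged for a single linear-chain model, the sketch below encodes PU and DF tags in two ways: a cross-product scheme (one composite label per token) and a mixed-label scheme (one label drawn from a merged inventory). It is not taken from the paper. The label names (`None`, `Period`, `O`, `E`, `F`), the `|` separator, the function names, and the rule that a disfluent token takes its DF label in the mixed scheme are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical label inventories, chosen for illustration only.
DF_FLUENT = "O"   # assumed tag for a fluent (non-disfluent) token
SEPARATOR = "|"   # assumed separator for composite labels


def cross_product_labels(pu: List[str], df: List[str]) -> List[str]:
    """Pair each PU tag with the DF tag of the same token."""
    if len(pu) != len(df):
        raise ValueError("PU and DF sequences must have the same length")
    return [f"{p}{SEPARATOR}{d}" for p, d in zip(pu, df)]


def split_cross_product(labels: List[str]) -> Tuple[List[str], List[str]]:
    """Recover the separate PU and DF sequences from composite labels."""
    pu, df = [], []
    for label in labels:
        p, d = label.split(SEPARATOR, 1)
        pu.append(p)
        df.append(d)
    return pu, df


def mixed_labels(pu: List[str], df: List[str]) -> List[str]:
    """Assumed rule: a disfluent token keeps its DF tag, otherwise its PU tag."""
    if len(pu) != len(df):
        raise ValueError("PU and DF sequences must have the same length")
    return [d if d != DF_FLUENT else p for p, d in zip(pu, df)]


# Invented example utterance: "i want uh i need a ticket"
pu_tags = ["None", "None", "None", "None", "None", "None", "Period"]
df_tags = ["E", "E", "F", "O", "O", "O", "O"]

cross = cross_product_labels(pu_tags, df_tags)
mixed = mixed_labels(pu_tags, df_tags)
```

Under these assumptions the cross-product encoding is lossless, since `split_cross_product` recovers both sequences, but the label set grows to the product of the two inventories. The mixed encoding keeps the label set small at the cost of dropping the PU tag on any token marked disfluent.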

License type:

PublisherCopyrights

Funding Info:

National University of Singapore, Institute for Infocomm Research

Description:

URI:

https://oar.a-star.edu.sg/communities-collections/articles/12852

ISBN:

Collections:

Institute for Infocomm Research

Files uploaded:

Manuscripts in This Item:

File	Size	Format
acl2014.pdf	768.88 KB	PDF