Combining Punctuation and Disfluency Prediction: An Empirical Study

Title:
Combining Punctuation and Disfluency Prediction: An Empirical Study
Journal Title:
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing
Publication Date:
25 October 2014
Abstract:
Punctuation prediction and disfluency prediction can improve downstream natural language processing tasks such as machine translation and information extraction. Combining the two tasks can potentially improve the efficiency of the overall pipeline system and reduce error propagation. In this work, we compare various methods for combining punctuation prediction (PU) and disfluency prediction (DF) on the Switchboard corpus. We compare an isolated prediction approach with a cascade approach, a rescoring approach, and three joint model approaches. For the cascade approach, we show that the soft cascade method is better than the hard cascade method. We also use the cascade models to generate an n-best list, use the bi-directional cascade models to perform rescoring, and compare that with the results of the cascade models. For the joint model approach, we compare mixed-label Linear-chain Conditional Random Field (LCRF), cross-product LCRF and 2-layer Factorial Conditional Random Field (FCRF) with soft-cascade LCRF. Our results show that the various methods linking the two tasks are not significantly different from one another, although they perform better than the isolated prediction method by 0.5–1.5% in the F1 score. Moreover, the clique order of features also shows a marked difference.
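As a rough illustration of the cross-product labelling idea named in the abstract, the sketch below merges a per-word punctuation tag sequence and a per-word disfluency tag sequence into a single joint tag sequence that one linear-chain tagger could predict, then splits it back. The tag inventories (`PU_LABELS`, `DF_LABELS`), the `|` separator, the function names, and the example utterance are all assumptions made for illustration; the abstract does not specify the label sets or encoding the authors used.

```python
from itertools import product

# Hypothetical tag inventories (not taken from the paper).
PU_LABELS = ["None", "Comma", "Period", "QMark"]  # punctuation after the word
DF_LABELS = ["F", "E"]                            # fluent word / word to be removed

# Cross-product label set: every (punctuation, disfluency) pair is one joint tag.
JOINT_LABELS = [f"{p}|{d}" for p, d in product(PU_LABELS, DF_LABELS)]


def to_joint(pu_tags, df_tags):
    """Merge two aligned per-word tag sequences into one joint sequence."""
    if len(pu_tags) != len(df_tags):
        raise ValueError("tag sequences must have the same length")
    return [f"{p}|{d}" for p, d in zip(pu_tags, df_tags)]


def from_joint(joint_tags):
    """Split a joint tag sequence back into punctuation and disfluency tags."""
    pu_tags, df_tags = [], []
    for tag in joint_tags:
        p, d = tag.split("|", 1)
        pu_tags.append(p)
        df_tags.append(d)
    return pu_tags, df_tags


# Made-up utterance: "i want uh i need a ticket"
words = ["i", "want", "uh", "i", "need", "a", "ticket"]
pu = ["None", "None", "None", "None", "None", "None", "Period"]
df = ["E", "E", "E", "F", "F", "F", "F"]

joint = to_joint(pu, df)
for word, tag in zip(words, joint):
    print(f"{word}\t{tag}")
```

Under this encoding the joint label set grows as the product of the two inventories (8 tags here), which is the usual trade-off of a cross-product scheme against modelling the two tag layers separately.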
License type:
Publisher Copyrights
Funding Info:
National University of Singapore, Institute for Infocomm Research