DynaEval: Unifying Turn and Dialogue Level Evaluation

Title:
DynaEval: Unifying Turn and Dialogue Level Evaluation
Journal Title:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Keywords:
Publication Date:
27 July 2021
Citation:
Zhang, C., Chen, Y., D’Haro, L. F., Zhang, Y., Friedrichs, T., Lee, G., & Li, H. (2021). DynaEval: Unifying Turn and Dialogue Level Evaluation. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 5676–5689. https://doi.org/10.18653/v1/2021.acl-long.441
Abstract:
A dialogue is essentially a multi-turn interaction among interlocutors. Effective evaluation metrics should reflect the dynamics of such interaction. Existing automatic metrics are focused very much on the turn-level quality, while ignoring such dynamics. To this end, we propose DynaEval, a unified automatic evaluation framework which is not only capable of performing turn-level evaluation, but also holistically considers the quality of the entire dialogue. In DynaEval, the graph convolutional network (GCN) is adopted to model a dialogue in totality, where the graph nodes denote each individual utterance and the edges represent the dependency between pairs of utterances. A contrastive loss is then applied to distinguish well-formed dialogues from carefully constructed negative samples. Experiments show that DynaEval significantly outperforms the state-of-the-art dialogue coherence model, and correlates strongly with human judgements across multiple dialogue evaluation aspects at both turn and dialogue level.
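The pipeline the abstract describes (utterances as graph nodes, edges for dependencies between utterance pairs, and a contrastive objective against negative samples) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the fixed context window for edges, the mean-aggregation step standing in for a trained GCN layer, the margin ranking form of the contrastive loss, and the shuffling-based negative sample are all assumptions made for illustration.

```python
import random


def build_edges(num_utterances, window=1):
    """Link each utterance node to itself and to neighbours within `window` turns.

    Assumption: a fixed context window; the abstract only says edges
    represent dependencies between pairs of utterances.
    """
    return {
        i: [j for j in range(num_utterances) if abs(i - j) <= window]
        for i in range(num_utterances)
    }


def graph_conv(features, edges):
    """One simplified graph-convolution step: average each node with its neighbours."""
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for _, nbrs in sorted(edges.items())
    ]


def dialogue_score(features, weights, window=1):
    """Dialogue-level score: graph step, mean-pool over nodes, linear readout."""
    hidden = graph_conv(features, build_edges(len(features), window))
    pooled = [sum(h[d] for h in hidden) / len(hidden) for d in range(len(weights))]
    return sum(w * p for w, p in zip(weights, pooled))


def margin_ranking_loss(pos_score, neg_score, margin=1.0):
    """Hypothetical contrastive loss: zero once the original dialogue
    outscores the negative sample by at least `margin`."""
    return max(0.0, margin - (pos_score - neg_score))


def make_negative(features, seed=0):
    """Hypothetical negative sample: the same utterances in shuffled order."""
    rng = random.Random(seed)
    shuffled = features[:]
    rng.shuffle(shuffled)
    return shuffled
```

In a real system the utterance features would come from a sentence encoder and the weights would be learned by minimising the loss over many dialogue pairs; here they are plain lists of floats so the sketch runs with the standard library alone.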
License type:
Publisher Copyright
Funding Info:
This research is supported by the Agency for Science, Technology and Research (A*STAR) - Advanced Manufacturing and Engineering (AME) Programmatic Funding Scheme
Grant Reference no.: A18A2b0046

This research is supported by the National Research Foundation Singapore - National Robotics Programme: Human-Robot Interaction Phase 1
Grant Reference no.: 1922500054

This research is supported by Robert Bosch (SEA) Pte Ltd - EDB’s Industrial Postgraduate Programme – II (EDB-IPP), project title: Applied Natural Language Processing
Grant Reference no. :
Description:
© 2021 Association for Computational Linguistics. Permission is granted to make copies for the purposes of teaching and research. Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License.
ACL Anthology ID:
2021.acl-long.441