DSPM-NLG: A Dual Supervised Pre-trained Model for Few-shot Natural Language Generation in Task-oriented Dialogue System

Title:
DSPM-NLG: A Dual Supervised Pre-trained Model for Few-shot Natural Language Generation in Task-oriented Dialogue System
Journal Title:
Findings of the Association for Computational Linguistics: ACL 2023
Keywords:
Publication Date:
05 August 2023
Citation:
Wang, Y., Zou, B., Fan, R., Aw, A. T., & He, T. (2023). DSPM-NLG: A Dual Supervised Pre-trained Model for Few-shot Natural Language Generation in Task-oriented Dialogue System. Findings of the Association for Computational Linguistics: ACL 2023. https://doi.org/10.18653/v1/2023.findings-acl.784
Abstract:
In few-shot settings, fully conveying the semantic information of the dialogue act is a crucial challenge for Natural Language Generation (NLG) in task-oriented dialogue systems. An interesting fact is that NLG and Spoken Language Understanding (SLU) form a natural dual problem pair: if the response generated by the NLG module can be restored to the corresponding dialogue act by the SLU module, this indicates that the generated response fully conveys the semantic information of the dialogue act. Based on this idea, a novel Dual Supervised Pre-trained Model for few-shot Natural Language Generation (DSPM-NLG) is proposed to regularize the pre-training process. We adopt a joint model with a dual supervised framework to learn the dual correlation between NLG and SLU from the perspective of probability. In addition, a slot-masked strategy is designed to enable the model to focus better on the key slot-value pairs. DSPM-NLG is continuously trained on existing public large-scale annotated data, so that it thoroughly learns the duality between the two tasks and strengthens the semantic-controlling and generalization abilities of the pre-trained model. Experiments demonstrate that our proposed model performs outstandingly on the few-shot benchmark dataset and outperforms the previous SOTA results.
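To illustrate the dual supervised idea described in the abstract, the following is a minimal sketch of a joint training objective in the style of standard dual supervised learning: an NLG loss, an SLU loss, and a regularizer that penalizes violations of the probabilistic duality P(act) * P(response | act) = P(response) * P(act | response). All function and variable names are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of a dual supervised objective for NLG <-> SLU.
# Assumes per-example log-probabilities are already computed by the two
# directional models and by marginal (language-model) estimators; none of
# these names come from the DSPM-NLG paper itself.
import torch


def dual_supervised_loss(logp_resp_given_act,   # log P(response | act), NLG direction
                         logp_act_given_resp,   # log P(act | response), SLU direction
                         logp_act_marginal,     # log P(act), marginal estimate
                         logp_resp_marginal,    # log P(response), marginal estimate
                         lambda_dual=0.01):
    """Joint loss: NLG NLL + SLU NLL + a duality regularizer that, in log
    space, penalizes the gap between log P(act) + log P(resp|act) and
    log P(resp) + log P(act|resp)."""
    nlg_loss = -logp_resp_given_act.mean()
    slu_loss = -logp_act_given_resp.mean()
    duality_gap = (logp_act_marginal + logp_resp_given_act
                   - logp_resp_marginal - logp_act_given_resp)
    dual_reg = (duality_gap ** 2).mean()
    return nlg_loss + slu_loss + lambda_dual * dual_reg


if __name__ == "__main__":
    # Toy usage with a batch of two examples (log-probabilities are made up).
    lp_r_a = torch.tensor([-12.3, -9.8])
    lp_a_r = torch.tensor([-4.1, -3.7])
    lp_a = torch.tensor([-6.0, -5.5])
    lp_r = torch.tensor([-15.0, -13.2])
    print(dual_supervised_loss(lp_r_a, lp_a_r, lp_a, lp_r))
```

In such a setup, the regularizer ties the two directions together during (pre-)training, so that a generated response which the SLU direction cannot map back to its dialogue act incurs an extra penalty; the weight lambda_dual trades this off against the two task losses.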
License type:
Attribution 4.0 International (CC BY 4.0)
Funding Info:
This research is supported by core funding from I2R.
Grant Reference no.: CR-2021-001
Description:
ISSN: