Longyin Zhang, Bowei Zou, Jacintha Yi, and AiTi Aw. 2024. Comprehensive Abstractive Comment Summarization with Dynamic Clustering and Chain of Thought. In Findings of the Association for Computational Linguistics: ACL 2024, pages 2884–2896, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Abstract:
Real-world news comments pose a significant challenge due to their noisy and ambiguous nature, which complicates their modeling for clustering and summarization tasks. Most previous research has predominantly focused on extractive summarization methods within specific constraints. This paper concentrates on Clustering and Abstractive Summarization of online news Comments (CASC). First, we introduce an enhanced fast clustering algorithm that maintains a dynamic similarity threshold to ensure the high density of each comment cluster being built. Moreover, we pioneer the exploration of tuning Large Language Models (LLMs) through a chain-of-thought strategy to generate summaries for each comment cluster. On the other hand, a notable challenge in CASC research is the scarcity of evaluation data. To address this problem, we design an annotation scheme and contribute a manual test suite tailored for CASC. Experimental results on the test suite demonstrate the effectiveness of our improvements to the baseline methods. In addition, the quantitative and qualitative analyses illustrate the adaptability of our approach to real-world news comment scenarios.
License type:
Attribution 4.0 International (CC BY 4.0)
Funding Info:
This research is supported by SC20/22-319800 (IND & CF), the MATSU project.