Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data

Please use this identifier to cite or link to this item: https://oar.a-star.edu.sg/communities-collections/articles/22269

Title:

Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data

Journal Title:

The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025)

DOI:

10.18653/v1/2025.findings-emnlp

Publication URL:

https://aclanthology.org/2025.findings-emnlp.760/

Authors:

Qiongqiong Wang, Hardik B. Sailor, Tianchi Liu, Wenyu Zhang, Muhammad Huzaifah, Nattadaporn Lertcheva, Shuo Sun, Nancy Chen, Jinyang Wu, AiTi Aw

Publication Date:

05 November 2025

Citation:

Qiongqiong Wang, Hardik Bhupendra Sailor, Tianchi Liu, Wenyu Zhang, Muhammad Huzaifah, Nattadaporn Lertcheva, Shuo Sun, Nancy F. Chen, Jinyang Wu, and AiTi Aw. 2025. Benchmarking Contextual and Paralinguistic Reasoning in Speech-LLMs: A Case Study with In-the-Wild Data. In Findings of the Association for Computational Linguistics: EMNLP 2025

Abstract:

Recent speech-LLMs have shown impressive performance in tasks like transcription and translation, yet they remain limited in understanding the paralinguistic aspects of speech crucial for social and emotional intelligence. We propose CP-Bench, a benchmark for evaluating speech-LLMs on contextual paralinguistic reasoning: the integration of verbal content with non-verbal cues like emotion and prosody. The benchmark includes two curated question-answering (QA) datasets requiring both linguistic and empathetic understanding. We evaluate state-of-the-art speech-LLMs, both open- and closed-source, and perform a comprehensive analysis across different question types. The top two models were further analyzed under temperature tuning to understand its effect on this task. Our benchmark reveals a key gap in existing evaluations and offers insights into building more context-aware and emotionally intelligent speech-capable LLMs.

License type:

Attribution 4.0 International (CC BY 4.0)

Funding Info:

This research / project is supported by the National Research Foundation, Singapore - National Large Language Models Funding
Grant Reference no.: EC-2024-021

URI:

https://oar.a-star.edu.sg/communities-collections/articles/22269

ISBN:

979-8-89176-335-7

Collections:

Institute for Infocomm Research

Files uploaded:

Manuscripts in This Item:

File	Size	Format
2553.pdf	1.24 MB	PDF