Discourse Diversity in Multi-Turn Empathic Dialogue
Hongli Zhan, Emma S. Gueorguieva, Javier Hernandez, Jina Suh, Desmond C. Ong, et al.
TLDR
This paper introduces MINT, an RL framework that significantly improves discourse diversity and empathy in multi-turn LLM conversations by reducing tactic repetition.
Key contributions
- Once a discourse tactic appears in a supporter turn, LLMs reuse it in the next turn at nearly double the human rate (0.50-0.56 vs. 0.27) in multi-turn empathic dialogue.
- Introduces MINT, a novel reinforcement learning framework for optimizing discourse move diversity across multi-turn empathic dialogue.
- MINT combines an empathy quality reward with a cross-turn tactic novelty signal for training LLMs (see the sketch after this list).
- MINT improves aggregate empathy by 25.3% over the vanilla model (across 1.7B and 4B models) and reduces cross-turn discourse move repetition by 26.3% on the 4B model, surpassing all baselines.
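The summary does not include the paper's reward formulation, so the following is only a minimal sketch of the idea: a per-turn reward that adds a cross-turn tactic novelty bonus to an empathy quality score. The scorer, tactic classifier, and weight `alpha` are hypothetical placeholders, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): combine an empathy-quality score
# with a bonus for discourse tactics not used in earlier supporter turns.
from typing import Callable, List, Set


def mint_style_reward(
    response: str,
    prev_tactics: List[str],
    empathy_scorer: Callable[[str], float],        # e.g., a learned reward model in [0, 1]
    tactic_classifier: Callable[[str], Set[str]],  # maps a response to its discourse tactics
    alpha: float = 0.5,                            # assumed quality/novelty trade-off weight
) -> float:
    """Reward = empathy quality + alpha * fraction of tactics unseen in prior turns."""
    quality = empathy_scorer(response)
    tactics = tactic_classifier(response)
    if not tactics:
        novelty = 0.0
    else:
        seen = set(prev_tactics)
        novelty = len(tactics - seen) / len(tactics)
    return quality + alpha * novelty
```

In this sketch the novelty term only rewards tactics the supporter has not yet used in the conversation, which is one simple way to operationalize "cross-turn tactic novelty"; the paper's actual signal may be defined differently.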
Why it matters
LLMs produce formulaic, repetitive discourse in multi-turn empathic conversations despite receiving high empathy ratings in single-turn settings. MINT addresses this by making LLM-based emotional support more varied and human-like, showing that what current models lack is discourse diversity, not empathy itself.
Original Abstract
Large language models (LLMs) produce responses rated as highly empathic in single-turn settings (Ayers et al., 2023; Lee et al., 2024), yet they are also known to be formulaic generators that reuse the same lexical patterns, syntactic templates, and discourse structures across tasks (Jiang et al., 2025; Shaib et al., 2024; Namuduri et al., 2025). Less attention has been paid to whether this formulaicity extends to the level of discourse moves, i.e., what a response does for the person it is addressing. This question is especially consequential for empathic dialogue, where effective support demands not just a kind response at one moment but varied strategies as a conversation unfolds (Stiles et al., 1998). Indeed, prior work shows that LLMs reuse the same tactic sequences more than human supporters in single-turn settings (Gueorguieva et al., 2026). We extend this analysis to multi-turn conversations and find that the rigidity compounds: once a tactic appears in a supporter turn, LLMs reuse it in the next at nearly double the rate of humans (0.50-0.56 vs. 0.27). This pattern holds across LLMs serving as supporters in real emotional support conversations, and is invisible to standard similarity metrics. To address this gap, we introduce MINT (Multi-turn Inter-tactic Novelty Training), the first reinforcement learning framework to optimize discourse move diversity across multi-turn empathic dialogue. The best MINT variant combines an empathy quality reward with a cross-turn tactic novelty signal, improving aggregate empathy by 25.3% over vanilla across 1.7B and 4B models while reducing cross-turn discourse move repetition by 26.3% on the 4B model, surpassing all baselines including quality-only and token-level diversity methods on both measures. These results suggest that what current models lack is not empathy itself, but the ability to vary their discourse moves across a conversation.
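As a concrete reading of the headline numbers (0.50-0.56 for LLMs vs. 0.27 for humans), here is a minimal sketch of one plausible way to compute a cross-turn tactic reuse rate: the fraction of consecutive supporter-turn pairs in which a tactic from one turn reappears in the next. The function name and exact definition are assumptions; the paper's metric may differ.

```python
# Illustrative sketch: cross-turn tactic reuse rate over a conversation,
# given the set of discourse tactics detected in each supporter turn.
from typing import List, Set


def cross_turn_reuse_rate(turn_tactics: List[Set[str]]) -> float:
    """turn_tactics[i] is the set of discourse tactics in supporter turn i."""
    pairs = list(zip(turn_tactics, turn_tactics[1:]))
    if not pairs:
        return 0.0
    repeats = sum(1 for prev, nxt in pairs if prev & nxt)
    return repeats / len(pairs)


# Example: "reassurance" carries over between consecutive turns once in three pairs.
print(cross_turn_reuse_rate([
    {"reflection"},
    {"reassurance"},
    {"reassurance", "question"},  # reuses a tactic from the previous turn
    {"suggestion"},
]))  # -> 0.333...
```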