STRIDE-ED: A Strategy-Grounded Stepwise Reasoning Framework for Empathetic Dialogue Systems

arXiv cs.CL / 4/9/2026


Key Points

  • The paper introduces STRIDE-ED, a strategy-grounded, interpretable, stepwise reasoning framework that aims to improve empathetic dialogue by conditioning each response-generation decision on explicit strategies and the dialogue context.
  • It proposes a strategy-aware data refinement pipeline that uses LLM-based annotation, consistency-weighted evaluation across multiple models, and dynamic sampling to build higher-quality training data aligned to empathetic strategies.
  • STRIDE-ED is trained via a two-stage process combining supervised fine-tuning with multi-objective reinforcement learning to better align outputs with target emotions, empathetic strategies, and response formats.
  • Experiments reportedly show STRIDE-ED generalizes across multiple open-source LLMs and outperforms prior methods on both automatic metrics and human evaluations.
  • The work frames empathetic dialogue as a multi-stage cognitive and decision-making problem rather than a single-step generation task, and targets limitations the authors attribute to prior approaches: the lack of a comprehensive empathy strategy framework and of high-quality strategy-aware data.
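
To make the data refinement idea in the key points more concrete, here is a minimal sketch of consistency-weighted scoring followed by score-proportional sampling. It is an illustration under stated assumptions, not the paper's pipeline: the summary does not give the scoring formula, so the rating scale, the use of population standard deviation as a disagreement measure, and the names `consistency_weighted_score` and `sample_training_examples` are all hypothetical.

```python
import random
from statistics import mean, pstdev


def consistency_weighted_score(scores, max_score=5.0):
    """Combine ratings from several judge models into one value in [0, 1].

    ASSUMPTION: each judge rates a sample on [0, max_score]. The average
    rating is scaled down when the judges disagree. This formula is
    illustrative and is not taken from the paper.
    """
    if not scores:
        raise ValueError("need at least one judge score")
    average = mean(scores) / max_score
    # For values in [0, max_score], the population std is at most max_score / 2.
    disagreement = pstdev(scores) / (max_score / 2)
    return average * (1.0 - disagreement)


def sample_training_examples(examples, judge_scores, k, seed=0):
    """Draw k examples (with replacement), favoring high, consistent scores.

    ASSUMPTION: "dynamic sampling" is approximated here by sampling
    proportionally to the consistency-weighted score.
    """
    weights = [consistency_weighted_score(s) for s in judge_scores]
    if sum(weights) <= 0:
        raise ValueError("all examples scored zero; nothing to sample")
    rng = random.Random(seed)
    return rng.choices(examples, weights=weights, k=k)
```

With this weighting, a sample rated 4 by every judge keeps its full normalized score, while a sample that splits the judges between 0 and 5 is weighted to zero and is never drawn.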

Abstract

Empathetic dialogue requires not only recognizing a user's emotional state but also making strategy-aware, context-sensitive decisions throughout response generation. However, the lack of a comprehensive empathy strategy framework, explicit task-aligned multi-stage reasoning, and high-quality strategy-aware data fundamentally limits existing approaches, preventing them from effectively modeling empathetic dialogue as a complex, multi-stage cognitive and decision-making process. To address these challenges, we propose STRIDE-ED, a STRategy-grounded, Interpretable, and DEep reasoning framework that models Empathetic Dialogue through structured, strategy-conditioned reasoning. To support effective learning, we develop a strategy-aware data refinement pipeline integrating LLM-based annotation, multi-model consistency-weighted evaluation, and dynamic sampling to construct high-quality training data aligned with empathetic strategies. Furthermore, we adopt a two-stage training paradigm that combines supervised fine-tuning with multi-objective reinforcement learning to better align model behaviors with target emotions, empathetic strategies, and response formats. Extensive experiments demonstrate that STRIDE-ED generalizes across diverse open-source LLMs and consistently outperforms existing methods on both automatic metrics and human evaluations.
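
The abstract names three alignment targets for the reinforcement learning stage: target emotions, empathetic strategies, and response formats. The sketch below shows one simple way such signals could be folded into a single scalar reward. Everything specific in it is an assumption made for illustration: the `Emotion:/Strategy:/Response:` output layout, the exact-match comparisons, the weights, and the names `RewardWeights` and `combined_reward` do not come from the paper.

```python
import re
from dataclasses import dataclass

# HYPOTHETICAL output layout; the source does not specify the real format.
_OUTPUT_PATTERN = re.compile(
    r"^Emotion:\s*(?P<emotion>.+?)\n"
    r"Strategy:\s*(?P<strategy>.+?)\n"
    r"Response:\s*(?P<response>.+)$",
    re.DOTALL,
)


@dataclass(frozen=True)
class RewardWeights:
    # ASSUMPTION: illustrative weights, not values reported by the authors.
    emotion: float = 1.0
    strategy: float = 1.0
    fmt: float = 0.5


def combined_reward(output, target_emotion, target_strategy,
                    weights=RewardWeights()):
    """Return a weighted sum of format, emotion, and strategy rewards.

    Malformed outputs earn 0.0, because the emotion and strategy fields
    cannot be read when the layout is wrong.
    """
    match = _OUTPUT_PATTERN.match(output.strip())
    if match is None:
        return 0.0
    emotion_ok = match["emotion"].strip().lower() == target_emotion.lower()
    strategy_ok = match["strategy"].strip().lower() == target_strategy.lower()
    return (weights.fmt
            + weights.emotion * emotion_ok
            + weights.strategy * strategy_ok)
```

A weighted sum is only one possible choice; it keeps each objective visible as a separate term, which makes it easy to see which signal a policy is failing on.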