TCDA: Thread-Constrained Discourse-Aware Modeling for Conversational Sentiment Quadruple Analysis

arXiv cs.CL / 5/5/2026

Key Points

  • The paper targets Conversational Aspect-based Sentiment Quadruple Analysis (DiaASQ), which requires modeling nuanced relationships across multiple dialogue rounds.
  • It argues that prior approaches either add structural noise (GCN-based methods) or inadequately represent dialogue structure and timing (standard RoPE, including the “Distance Dilution” issue).
  • The proposed framework combines Thread-Constrained Directed Acyclic Graphs (TC-DAG) to suppress cross-thread noise while preserving global context via root anchoring, with Discourse-Aware RoPE (D-RoPE) to better reflect discourse progression.
  • D-RoPE uses dual-stream projection and multi-scale frequency signals, models thread dependencies via tree-like distances, and separates token-level syntactic order from utterance-level temporal progress.
  • Experiments on two benchmark datasets show the method achieves state-of-the-art results.
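As a rough illustration of the thread-constraint idea in the points above, the sketch below builds a boolean adjacency matrix over utterances. It is not the paper's construction: the function name `thread_constrained_edges`, the `thread_ids` input, and the rule "connect to earlier utterances in the same thread, plus the root" are assumptions made here to show how cross-thread links could be dropped while the root stays reachable.

```python
from typing import List


def thread_constrained_edges(thread_ids: List[int], root: int = 0) -> List[List[bool]]:
    """Return adj where adj[i][j] is True if utterance i may link to utterance j.

    Assumed rule (illustrative only): i links to j when j comes earlier in the
    dialogue AND (j is in the same thread as i OR j is the root utterance).
    """
    n = len(thread_ids)
    adj = [[False] * n for _ in range(n)]
    for i in range(n):
        # Only earlier utterances are candidates, so edges always point
        # backward in time and the graph cannot contain a cycle.
        for j in range(i):
            same_thread = thread_ids[i] == thread_ids[j]
            if same_thread or j == root:
                adj[i][j] = True
    return adj


# Hypothetical dialogue: utterance 0 is the root, 1-2 form one thread, 3-4 another.
adj = thread_constrained_edges([0, 1, 1, 2, 2])
```

In this toy dialogue, utterance 3 links to the root and to nothing in the other thread, while utterance 4 links to both the root and utterance 3.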

Abstract

Conversational Aspect-based Sentiment Quadruple Analysis (DiaASQ) requires capturing complex relationships that span multiple dialogue turns. Existing methods usually take one of two routes. Some employ simple Graph Convolutional Networks (GCN), which introduce structural noise and ignore the temporal order of the dialogue. Others use standard RoPE, which implicitly captures relative distances in a flat sequence but cannot clearly separate token-level syntactic order from utterance-level progression and may suffer from the Distance Dilution problem. To address these issues, we propose a new framework that combines a Thread-Constrained Directed Acyclic Graph (TC-DAG) with Discourse-Aware Rotary Position Embedding (D-RoPE). Specifically, TC-DAG filters out cross-thread noise through thread constraints, maintains global connectivity through root anchoring, and incorporates the temporal order of the dialogue. D-RoPE aligns multi-layer semantics using dual-stream projection and multi-scale frequency signals, captures thread dependencies using tree-like distances, and alleviates token-level Distance Dilution by incorporating utterance-level progression. Experimental results on two benchmark datasets demonstrate that our framework achieves state-of-the-art performance.
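
To make the distinction between token-level order and utterance-level progression more concrete, here is a minimal sketch of two-level position indexing and a reply-tree distance. It is not D-RoPE itself and involves no rotary embedding: the helpers `two_level_positions` and `tree_distance`, the `parents` array encoding, and the toy utterance lengths are all assumptions introduced for illustration.

```python
from typing import List, Tuple


def two_level_positions(utterance_lengths: List[int]) -> List[Tuple[int, int]]:
    """Give each token a (position within its utterance, utterance index) pair."""
    positions = []
    for utt_index, length in enumerate(utterance_lengths):
        for tok_index in range(length):
            positions.append((tok_index, utt_index))
    return positions


def tree_distance(parents: List[int], a: int, b: int) -> int:
    """Count edges between utterances a and b in a reply tree.

    parents[i] is the utterance that i replies to, or -1 for the root
    (a hypothetical encoding chosen for this sketch).
    """
    def path_to_root(node: int) -> List[int]:
        path = [node]
        while parents[path[-1]] != -1:
            path.append(parents[path[-1]])
        return path

    depth_from_a = {node: d for d, node in enumerate(path_to_root(a))}
    for depth_from_b, node in enumerate(path_to_root(b)):
        if node in depth_from_a:  # first shared ancestor on b's path
            return depth_from_a[node] + depth_from_b
    raise ValueError("utterances are not in the same tree")


# Toy dialogue with a short, a long, and a short utterance.
positions = two_level_positions([3, 20, 4])
flat_gap = 23 - 2                                   # gap in flat token indices
utterance_gap = positions[23][1] - positions[2][1]  # gap in utterance indices

# Toy reply tree: 1 and 3 reply to the root, 2 replies to 1, 4 replies to 3.
parents = [-1, 0, 1, 0, 3]
cross_thread = tree_distance(parents, 2, 4)
```

Here the last token of the first utterance and the first token of the third are 21 flat positions apart but only 2 utterances apart, which is one way to picture how a long intervening utterance can stretch token-level distance. In the toy reply tree, utterances 2 and 4 are 4 edges apart because the path between them runs through the root.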