Care-Conditioned Neuromodulation for Autonomy-Preserving Supportive Dialogue Agents

arXiv cs.LG / 4/3/2026

Key Points

  • The paper argues that LLM supportive and advisory agents need alignment objectives that explicitly target relational risks such as dependency reinforcement, overprotection, and coercive guidance, not just general helpfulness and harmlessness.
  • It introduces Care-Conditioned Neuromodulation (CCN), a state-dependent control approach in which a learned scalar signal, derived from structured user state and dialogue context, conditions response generation and candidate selection.
  • The authors formalize autonomy-preserving alignment as a multi-objective utility problem that rewards autonomy support and helpfulness while penalizing dependency and coercion (see the sketch after this list).
  • They construct a benchmark of multi-turn relational failure modes (reassurance dependence, manipulative care, overprotection, and boundary inconsistency) and show that CCN-style candidate generation plus utility-based reranking improves autonomy-preserving utility by +0.25 over supervised fine-tuning and +0.07 over preference-optimization baselines.
  • Pilot human evaluation and zero-shot transfer to real emotional-support conversations align directionally with automated metrics, suggesting the method is a practical route for autonomy-sensitive dialogue control.
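
Taking the abstract's description at face value, the utility can be read as a weighted combination of two rewards and two penalties. Below is a minimal sketch of that reading; the symbols (x for dialogue context, s for user state, y for a candidate response, and the weights α through δ) are assumptions for illustration, since the paper's exact functional form is not given in this summary.

```latex
% Hypothetical rendering of the autonomy-preserving utility; the
% component scorers and weights are assumptions, not the paper's
% exact definitions.
U(y \mid x, s) \;=\;
    \alpha\, r_{\mathrm{auto}}(y \mid x, s)
  + \beta\,  r_{\mathrm{help}}(y \mid x, s)
  - \gamma\, c_{\mathrm{dep}}(y \mid x, s)
  - \delta\, c_{\mathrm{coer}}(y \mid x, s)

% Utility-based reranking then selects from the candidate set:
y^{*} \;=\; \arg\max_{y \,\in\, \mathcal{C}(x,\, s)} \; U(y \mid x, s)
```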

Abstract

Large language models deployed in supportive or advisory roles must balance helpfulness with preservation of user autonomy, yet standard alignment methods primarily optimize for helpfulness and harmlessness without explicitly modeling relational risks such as dependency reinforcement, overprotection, or coercive guidance. We introduce Care-Conditioned Neuromodulation (CCN), a state-dependent control framework in which a learned scalar signal derived from structured user state and dialogue context conditions response generation and candidate selection. We formalize this setting as an autonomy-preserving alignment problem and define a utility function that rewards autonomy support and helpfulness while penalizing dependency and coercion. We also construct a benchmark of relational failure modes in multi-turn dialogue, including reassurance dependence, manipulative care, overprotection, and boundary inconsistency. On this benchmark, care-conditioned candidate generation combined with utility-based reranking improves autonomy-preserving utility by +0.25 over supervised fine-tuning and +0.07 over preference optimization baselines while maintaining comparable supportiveness. Pilot human evaluation and zero-shot transfer to real emotional-support conversations show directional agreement with automated metrics. These results suggest that state-dependent control combined with utility-based selection is a practical approach to multi-objective alignment in autonomy-sensitive dialogue.
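
As a rough illustration of how the pieces described in the abstract could fit together, here is a hedged Python sketch of care-conditioned candidate generation plus utility-based reranking. Everything in it (the UserState fields, the hand-rolled care_signal, the generate and score stand-ins, and the weights) is a hypothetical scaffold, not the paper's implementation.

```python
import math
from dataclasses import dataclass

# Illustrative sketch of a CCN-style pipeline: a scalar care signal
# conditions generation, and candidates are reranked by a
# multi-objective utility. All names and weights are assumptions.

@dataclass
class UserState:
    distress: float             # structured user-state features in [0, 1]
    reassurance_seeking: float
    self_efficacy: float

def care_signal(state: UserState, context_score: float) -> float:
    """Map structured user state + dialogue context to a scalar in (0, 1).
    A learned model would replace this hand-rolled logistic."""
    z = (1.5 * state.distress
         + 1.0 * state.reassurance_seeking
         - 1.0 * state.self_efficacy
         + 0.5 * context_score)
    return 1.0 / (1.0 + math.exp(-z))

def utility(scores: dict[str, float],
            w: tuple[float, float, float, float] = (1.0, 1.0, 1.0, 1.0)) -> float:
    """Autonomy-preserving utility: reward autonomy support and
    helpfulness, penalize dependency and coercion (weights assumed)."""
    a, b, g, d = w
    return (a * scores["autonomy"] + b * scores["helpfulness"]
            - g * scores["dependency"] - d * scores["coercion"])

def respond(prompt: str, state: UserState, generate, score) -> str:
    """Condition candidate generation on the care signal, then rerank
    by utility. `generate` and `score` stand in for the language model
    and the learned per-objective scorers."""
    c = care_signal(state, context_score=0.0)
    candidates = generate(prompt, care=c, n=8)  # care-conditioned sampling
    return max(candidates, key=lambda y: utility(score(y)))
```

In this sketch the care signal enters only as a sampling condition; the neuromodulation framing would equally allow it to gate decoding parameters or internal activations, which the summary leaves unspecified.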