Too Polite to Disagree: Understanding Sycophancy Propagation in Multi-Agent Systems

arXiv cs.AI / 4/6/2026


Key Points

  • The paper studies how sycophancy in large language models propagates in collaborative multi-agent discussions, extending prior mostly single-agent research to multi-agent settings.
  • It runs controlled experiments with six open-source LLMs, using “peer sycophancy priors” (static or dynamic pre-/in-discussion rankings) to estimate each agent’s tendency to agree excessively.
  • The results show that providing sycophancy priors reduces the influence of sycophancy-prone agents on group outcomes, mitigates error cascades, and improves final discussion accuracy by an absolute 10.5%.
  • The authors conclude that injecting lightweight sycophancy-awareness can be an effective way to reduce agreement bias and improve downstream decision quality in multi-agent systems.
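The mechanism described above — score each agent's tendency to capitulate, rank the peers, and prepend that ranking to the discussion prompt — can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the probe format, the scoring rule (fraction of disagreements where the agent flips to the peer's answer), and all function names are assumptions.

```python
# Hypothetical sketch of a "peer sycophancy prior": score each agent by how
# often it abandons its own answer to match a disagreeing peer, rank agents,
# and inject the ranking into the discussion prompt. The scoring rule and
# names are illustrative, not the paper's exact method.

def sycophancy_score(exchanges):
    """exchanges: list of (own_answer, peer_answer, revised_answer) probes.
    Score = fraction of disagreements where the agent flips to the peer."""
    disagreements = [e for e in exchanges if e[0] != e[1]]
    if not disagreements:
        return 0.0
    flips = sum(1 for own, peer, revised in disagreements if revised == peer)
    return flips / len(disagreements)

def rank_peers(probe_results):
    """probe_results: {agent_name: exchanges}. Returns (name, score) pairs
    ranked from most to least sycophantic."""
    scored = {a: sycophancy_score(x) for a, x in probe_results.items()}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

def prior_prompt(ranking):
    """Format a lightweight prior to prepend to each agent's discussion turn."""
    lines = [f"- {name}: sycophancy score {score:.2f}" for name, score in ranking]
    return ("Peer sycophancy ranking (higher = more likely to agree "
            "regardless of correctness; weigh their agreement less):\n"
            + "\n".join(lines))

# Toy probes: agent_a holds its ground, agent_b always capitulates.
probes = {
    "agent_a": [("X", "Y", "X"), ("X", "Y", "X")],
    "agent_b": [("X", "Y", "Y"), ("X", "Y", "Y")],
}
print(prior_prompt(rank_peers(probes)))
```

In this sketch a static (pre-discussion) prior would compute the scores once from probe exchanges before the debate starts, while a dynamic (online) variant would re-run `rank_peers` on the growing transcript after each round.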

Abstract

Large language models (LLMs) often exhibit sycophancy: agreement with a user's stance even when it conflicts with the model's own opinion. While prior work has mostly studied this in single-agent settings, it remains underexplored in collaborative multi-agent systems. We ask whether awareness of other agents' sycophancy levels influences discussion outcomes. To investigate this, we run controlled experiments with six open-source LLMs, providing agents with peer sycophancy rankings that estimate each peer's tendency toward sycophancy. These rankings are based on scores calculated using various static (pre-discussion) and dynamic (online) strategies. We find that providing sycophancy priors reduces the influence of sycophancy-prone peers, mitigates error cascades, and improves final discussion accuracy by an absolute 10.5%. This makes sycophancy priors a lightweight, effective way to reduce discussion sycophancy and improve downstream accuracy.