Representational Collapse in Multi-Agent LLM Committees: Measurement and Diversity-Aware Consensus
arXiv cs.LG / 4/7/2026
Key Points
- Multi-agent LLM “committees” that reuse the same underlying model with different role prompts can suffer from representational collapse, where agents’ chain-of-thought rationales become overly similar despite majority-vote aggregation.
- Using three Qwen2.5-14B agents on 100 GSM8K problems, the study finds high mean pairwise cosine similarity (0.888) and low effective rank (2.17/3), indicating reduced diversity among agents.
- The paper proposes DALC, a training-free diversity-aware consensus protocol that computes per-agent diversity weights from the embedding geometry of rationales, improving accuracy to 87% on GSM8K versus 84% for self-consistency while cutting token cost by 26%.
- Ablations show that hint sharing often matters more than diversity weighting alone, run-to-run variance can reach 1–3 points per protocol, and the embedding/encoder choice can materially change collapse severity and downstream accuracy.
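The collapse diagnostics above (mean pairwise cosine similarity and effective rank) and a diversity-weighted vote can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the effective rank here is the exponential of the entropy of normalized singular values, and the `diversity_weights` heuristic (distance from the committee centroid) is an assumption standing in for DALC's actual embedding-geometry formula.

```python
import numpy as np

def collapse_metrics(embeddings: np.ndarray):
    """Diversity diagnostics for a committee of agent rationales.

    embeddings: (n_agents, d) array, one rationale embedding per agent.
    Returns (mean pairwise cosine similarity, effective rank), where
    effective rank = exp(entropy of normalized singular values).
    """
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = X @ X.T                       # pairwise cosine similarities
    iu = np.triu_indices(len(X), k=1)    # upper triangle, no diagonal
    mean_cos = sims[iu].mean()

    s = np.linalg.svd(embeddings, compute_uv=False)
    p = s / s.sum()
    eff_rank = float(np.exp(-(p * np.log(p + 1e-12)).sum()))
    return float(mean_cos), eff_rank

def diversity_weights(embeddings: np.ndarray) -> np.ndarray:
    """Hypothetical weighting: agents farther from the committee
    centroid get more vote weight, so near-duplicate rationales
    cannot dominate a majority vote."""
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    dists = np.linalg.norm(X - X.mean(axis=0), axis=1)
    total = dists.sum()
    if total == 0:                       # fully collapsed committee
        return np.full(len(X), 1.0 / len(X))
    return dists / total
```

On a collapsed committee (near-identical embeddings) `collapse_metrics` returns cosine similarity near 1 and effective rank near 1; on orthogonal embeddings it returns similarity near 0 and effective rank equal to the number of agents, matching the 0.888 / 2.17-of-3 readings reported for three agents.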