Weak-Link Optimization for Multi-Agent Reasoning and Collaboration

arXiv cs.AI / 4/20/2026


Key Points

  • The paper argues that multi-agent LLM frameworks can become unstable because errors from weak agents are amplified during collaboration.
  • It introduces WORC (weak-link optimization), a two-stage method that first localizes the "weak agent" via a meta-learned weight predictor over task features.
  • WORC then improves performance by reallocating reasoning budgets based on predicted weakness, giving weak agents larger uncertainty-driven repeated-sampling quotas to boost reliability.
  • Experiments report 82.2% average accuracy on reasoning benchmarks, along with better framework stability and cross-architecture generalization, suggesting robustness comes from compensating weak links rather than only strengthening strong agents.

Abstract

LLM-driven multi-agent frameworks address complex reasoning tasks through multi-role collaboration. However, existing approaches often suffer from reasoning instability, where individual agent errors are amplified through collaboration, undermining overall performance. Current research mainly focuses on enhancing high-capability agents or suppressing unreliable outputs to improve framework effectiveness, while systematic identification and reinforcement of performance-limiting agents receive less attention. To address this gap, we propose WORC, a **w**eak-link **o**ptimization framework for multi-agent **r**easoning and **c**ollaboration, grounded in the weak-link principle. WORC follows a two-stage workflow. In the weak agent localization stage, task features are constructed, and a meta-learning-based weight predictor trained on optimal configurations identified by swarm intelligence algorithms (SIAs) enables zero-shot mapping from these features to agent performance weights, where the agent with the lowest predicted weight is identified as the weak agent. In the weak-link optimization stage, an uncertainty-driven allocation strategy assigns additional reasoning budgets to weak agents, with lower predicted weights leading to larger repeated-sampling quotas to compensate for reliability deficiencies. Experimental results show that WORC achieves an average accuracy of 82.2% on reasoning benchmarks while improving framework stability and cross-architecture generalization, suggesting that compensating for weak links, rather than reinforcing strengths alone, enhances the robustness of multi-agent systems.
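To make the two-stage workflow concrete, here is a minimal sketch of how weak-agent localization and budget reallocation could fit together. The function names, the inverse-weight allocation rule, and the example weight values are all assumptions for illustration; the abstract only states that the lowest-weight agent is the weak link and that lower predicted weights lead to larger repeated-sampling quotas.

```python
def localize_weak_agent(weights):
    """Stage 1 (sketch): the agent with the lowest predicted
    performance weight is identified as the weak agent."""
    return min(range(len(weights)), key=lambda i: weights[i])

def allocate_budgets(weights, total_budget):
    """Stage 2 (sketch): distribute repeated-sampling quotas
    inversely proportional to predicted weight, so weaker agents
    receive larger quotas. The inverse-proportional rule is an
    assumption; the paper only requires that lower weights get
    larger quotas."""
    inverse = [1.0 / w for w in weights]
    total = sum(inverse)
    return [round(total_budget * x / total) for x in inverse]

# Hypothetical predicted weights for three agents (higher = stronger).
weights = [0.9, 0.6, 0.3]
weak = localize_weak_agent(weights)        # index of the weak agent
quotas = allocate_budgets(weights, 12)     # weakest agent gets the most samples
```

Any monotone decreasing mapping from weight to quota would satisfy the stated property; inverse proportionality is just one simple choice.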