Phase-Scheduled Multi-Agent Systems for Token-Efficient Coordination

arXiv cs.AI / April 21, 2026


Key Points

  • The paper introduces Phase-Scheduled Multi-Agent Systems (PSMAS) to address token inefficiency in LLM-based multi-agent systems caused by simultaneous (unstructured) activation and indiscriminate context sharing.
  • PSMAS assigns each agent a fixed phase on a circular attention manifold and uses a global sweep signal to activate only agents within a small angular window, while idle agents receive compressed context summaries.
  • Implemented on LangGraph and evaluated across four structured benchmarks and two conversational settings, PSMAS reduces tokens by 27.3% on average while keeping performance within 2.1 percentage points of a fully activated baseline.
  • The authors prove stability, convergence, and optimality results for the sweep dynamics, and show that phase scheduling alone contributes 18–20 percentage points of the token savings, largely independent of context compression quality (robust to compression degradation up to alpha = 0.40).

Abstract

Multi-agent systems (MAS) powered by large language models suffer from severe token inefficiency arising from two compounding sources: (i) unstructured parallel execution, where all agents activate simultaneously irrespective of input readiness; and (ii) unrestricted context sharing, where every agent receives the full accumulated context regardless of relevance. Existing mitigation strategies - static pruning, hierarchical decomposition, and learned routing - treat coordination as a structural allocation problem and fundamentally ignore its temporal dimension. We propose Phase-Scheduled Multi-Agent Systems (PSMAS), a framework that reconceptualizes agent activation as continuous control over a shared attention space modeled on a circular manifold. Each agent i is assigned a fixed angular phase theta_i in the range [0, 2*pi], derived from the task dependency topology; a global sweep signal phi(t) rotates at velocity omega, activating only agents within an angular window epsilon. Idle agents receive compressed context summaries, reducing per-step token consumption. We implement PSMAS on LangGraph, evaluate on four structured benchmarks (HotPotQA-MAS, HumanEval-MAS, ALFWorld-Multi, WebArena-Coord) and two unstructured conversational settings, and prove stability, convergence, and optimality results for the sweep dynamics. PSMAS achieves a mean token reduction of 27.3 percent (range 21.4-34.8 percent) while maintaining task performance within 2.1 percentage points of a fully activated baseline (p < 0.01, n = 500 per configuration), and outperforms the strongest learned routing baseline by 5.6 percentage points in token reduction with 2.0 percentage points less performance drop. Crucially, we show that scheduling and compression are independent sources of gain: scheduling alone accounts for 18-20 percentage points of reduction, robust to compression degradation up to alpha = 0.40.
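The sweep-activation rule described above can be sketched concretely. The snippet below is a minimal illustration, not the paper's implementation: the phase assignment, the agent indices, and the activation check are simplified stand-ins (the paper derives each theta_i from the task dependency topology and runs agents on LangGraph, and idle agents would additionally receive compressed context summaries).

```python
# Hypothetical sketch of PSMAS sweep activation: agents hold fixed
# phases theta_i on a circle; a global sweep signal phi(t) rotates at
# angular velocity omega; only agents within an angular window epsilon
# of phi(t) are activated at time t.
import math

TWO_PI = 2 * math.pi

def angular_distance(a: float, b: float) -> float:
    """Shortest arc distance between two angles on [0, 2*pi)."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)

def sweep_phase(t: float, omega: float) -> float:
    """Global sweep signal phi(t) = omega * t, wrapped to the circle."""
    return (omega * t) % TWO_PI

def active_agents(phases, t, omega, epsilon):
    """Indices of agents whose phase lies within epsilon of phi(t).
    All other agents stay idle for this step (and, in the paper's
    framework, would receive only a compressed context summary)."""
    phi = sweep_phase(t, omega)
    return [i for i, theta in enumerate(phases)
            if angular_distance(theta, phi) <= epsilon]

# Example: four agents evenly spaced at 0, pi/2, pi, 3*pi/2.
phases = [i * TWO_PI / 4 for i in range(4)]
print(active_agents(phases, t=0.0, omega=1.0, epsilon=0.3))           # agent 0
print(active_agents(phases, t=math.pi / 2, omega=1.0, epsilon=0.3))   # agent 1
```

With a small epsilon relative to the phase spacing, at most one agent fires per step and the sweep visits agents in dependency order; widening epsilon trades token savings for more parallel activation.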