EvolveRouter: Co-Evolving Routing and Prompt for Multi-Agent Question Answering

arXiv cs.CL / 4/8/2026

Key Points

  • The paper introduces EvolveRouter, a trainable framework for multi-agent question answering that improves both routing and the agents through joint co-evolution rather than optimizing over a fixed agent pool.
  • It uses a closed loop: diagnostics from graph-based query routing guide targeted instruction refinement for the agents, and the refined agents in turn generate cleaner supervision for the router.
  • EvolveRouter adds an adaptive inference mechanism that dynamically selects the effective number of participating agents per query using router-weighted answer agreement.
  • Experiments on five QA benchmarks show consistent gains over state-of-the-art routing baselines in both F1 and exact match, and ablation and further analysis support the value of closed-loop refinement and adaptive collaboration.

Abstract

Large language model agents often exhibit complementary strengths, making routing a promising approach for multi-agent question answering. However, existing routing methods remain limited in two important ways: they typically optimize over a fixed pool of agents without improving the agents themselves, and they often rely on rigid collaboration schemes that cannot adapt the number of participating agents to the query. We propose EvolveRouter, a trainable framework that addresses both limitations by jointly improving agent quality and collaboration structure. First, EvolveRouter couples graph-based query routing with targeted instruction refinement in a closed-loop co-evolution process, allowing router diagnostics to guide agent improvement while refined agents provide cleaner supervision for routing. Second, it introduces an adaptive inference strategy that dynamically determines the effective collaboration size for each query through router-weighted answer agreement. Together, these designs enable more capable and more efficient multi-agent reasoning. Experiments on five question answering benchmarks show that EvolveRouter consistently outperforms SOTA routing baselines in both F1 and exact match, while further analysis confirms the benefits of closed-loop refinement and adaptive collaboration.
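
To make the adaptive inference idea more concrete, here is a minimal Python sketch of one way router-weighted answer agreement could decide how many agents take part. It is not the paper's implementation: the abstract does not give the stopping rule, so the function name `adaptive_answer`, the `threshold` and `max_agents` parameters, the answer normalization, and the rule itself (stop once the leading answer holds a fixed share of the total router weight) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Tuple


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers match."""
    return " ".join(answer.lower().split())


def adaptive_answer(
    query: str,
    agents: Dict[str, Callable[[str], str]],
    router_weights: Dict[str, float],
    threshold: float = 0.6,   # hypothetical agreement threshold
    max_agents: int = 5,      # hypothetical cap on collaboration size
) -> Tuple[str, int]:
    """Query agents in descending router weight and stop early on agreement.

    Returns the selected answer and the number of agents actually consulted.
    The stopping rule is an assumption, not taken from the paper.
    """
    # Consult the agents the router trusts most first.
    ranked = sorted(agents, key=lambda name: router_weights[name], reverse=True)
    ranked = ranked[:max_agents]
    total_weight = sum(router_weights[name] for name in ranked)

    votes: Dict[str, float] = defaultdict(float)
    best = ""
    for used, name in enumerate(ranked, start=1):
        # Each agent votes for its answer with its router weight.
        votes[normalize(agents[name](query))] += router_weights[name]
        best = max(votes, key=votes.get)
        # Stop once the leading answer holds enough of the total weight.
        if votes[best] / total_weight >= threshold:
            return best, used
    return best, len(ranked)


if __name__ == "__main__":
    # Toy agents standing in for LLM calls (hypothetical).
    toy_agents = {
        "a": lambda q: "Paris",
        "b": lambda q: " paris",
        "c": lambda q: "Lyon",
    }
    weights = {"a": 0.5, "b": 0.3, "c": 0.2}
    print(adaptive_answer("Capital of France?", toy_agents, weights))
```

In this toy run the two highest-weighted agents agree after normalization, so the third agent is never called. With a threshold above 0.5, an answer that triggers the early stop cannot be overturned by the remaining agents, which is one simple way to trade accuracy against the number of agent calls.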