SAGE: Multi-Agent Self-Evolution for LLM Reasoning
arXiv cs.AI · March 17, 2026
📰 News · Models & Research
Key Points
- SAGE introduces a closed-loop multi-agent framework where four roles—Challenger, Planner, Solver, and Critic—co-evolve from a shared LLM backbone using only a small seed set.
- The Challenger generates progressively harder tasks, the Planner converts tasks into structured multi-step plans, the Solver executes the plan, and the Critic scores and filters outcomes to prevent curriculum drift and maintain signal quality.
- The method delivers consistent gains on math and code-generation benchmarks, with reported improvements of 8.9% on LiveCodeBench and 10.7% on OlympiadBench for the Qwen-2.5-7B model.
- By relying on self-training with verifiable rewards and external verifiers, SAGE reduces dependence on large labeled datasets while improving long-horizon reasoning stability.
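The closed loop described above can be sketched in miniature. The following is a hypothetical illustration, not the paper's implementation: it stands in toy arithmetic tasks for the LLM backbone, and all function names and the curriculum rule (advance difficulty when more than half of samples pass the Critic) are assumptions chosen to make the four-role loop concrete.

```python
import random

# Illustrative sketch of a SAGE-style closed loop with toy arithmetic tasks.
# Role names follow the paper; everything else is a simplifying assumption.

def challenger(difficulty):
    """Generate a progressively harder task: sum `difficulty` integers."""
    nums = [random.randint(1, 10) for _ in range(difficulty)]
    return {"numbers": nums, "answer": sum(nums)}

def planner(task):
    """Convert the task into a structured multi-step plan."""
    return [("add", n) for n in task["numbers"]]

def solver(plan, error_rate=0.3):
    """Execute the plan step by step; noise models occasional solver errors."""
    total = 0
    for _, n in plan:
        total += n
    if random.random() < error_rate:
        total += 1  # inject a mistake so the Critic has something to filter
    return total

def critic(task, prediction):
    """Verifiable reward: exact match against ground truth."""
    return prediction == task["answer"]

def sage_loop(rounds=5, samples_per_round=20):
    """Run the loop; keep only Critic-verified pairs for self-training."""
    pool, difficulty = [], 2
    for _ in range(rounds):
        accepted = 0
        for _ in range(samples_per_round):
            task = challenger(difficulty)
            pred = solver(planner(task))
            if critic(task, pred):  # filtering prevents curriculum drift
                pool.append((task, pred))
                accepted += 1
        if accepted / samples_per_round > 0.5:
            difficulty += 1  # curriculum advances when the Solver keeps up
    return pool, difficulty
```

Because the Critic gates every sample against a verifiable reward, the self-training pool contains only correct task-solution pairs, which is the mechanism the key points credit with maintaining signal quality without large labeled datasets.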
Related Articles
- I made a 'benchmark' where LLMs write code controlling units in a 1v1 RTS game. (Dev.to)
- My AI Does Not Have a Clock (Dev.to)
- How to settle on a coding LLM? What parameters to watch out for? (Reddit r/LocalLLaMA)
- Andrej Karpathy's autonomous AI research agent ran 700 experiments in 2 days and gave a glimpse of where AI is heading (Reddit r/artificial)
- So cursor admits that Kimi K2.5 is the best open source model (Reddit r/LocalLLaMA)