Training-Free Agentic AI: Probabilistic Control and Coordination in Multi-Agent LLM Systems

arXiv cs.CL / 3/17/2026

Key Points

REDEREF is a lightweight, training-free controller that coordinates multi-agent LLM collaboration to improve routing efficiency during recursive delegation.
It combines belief-guided delegation via Thompson sampling, which prioritizes agents with historically positive marginal contributions; reflection-driven re-routing using a calibrated LLM or programmatic judge; evidence-based selection rather than output averaging; and memory-aware priors that reduce cold-start inefficiency.
Across multi-agent split-knowledge tasks, REDEREF reduces token usage by 28%, agent calls by 17%, and time-to-success by 19% compared with random recursive delegation.
The method adapts gracefully under agent or judge degradation and does not require training or fine-tuning.
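
To make the routing idea in the points above concrete, here is a minimal sketch of Thompson-sampling delegation with judge-driven re-routing. It is an illustration under stated assumptions, not REDEREF's implementation: the Beta-Bernoulli belief model, the names `BetaBelief` and `delegate`, the `max_calls` budget, and the toy agents and judge are all hypothetical and are not taken from the paper.

```python
import random


class BetaBelief:
    """Beta(alpha, beta) belief that an agent's next contribution is useful."""

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha
        self.beta = beta

    def sample(self, rng):
        # Thompson sampling: draw one plausible success rate from the posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, success):
        # Binary judge feedback increments the matching pseudo-count.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def delegate(task, agents, beliefs, judge, rng, max_calls=10):
    """Try agents until the judge accepts an answer or the budget runs out.

    Returns (answer, calls_used); answer is None if every attempt was rejected.
    """
    for calls_used in range(1, max_calls + 1):
        # Pick the agent whose sampled success rate is highest this round.
        name = max(agents, key=lambda n: beliefs[n].sample(rng))
        answer = agents[name](task)
        accepted = judge(task, answer)
        beliefs[name].update(accepted)
        if accepted:
            return answer, calls_used
        # Rejected: loop again, re-routing with the updated beliefs.
    return None, max_calls


# Toy setup (assumption): only "expert" holds the knowledge the task needs.
def expert(task):
    return "42"


def novice(task):
    return "unknown"


def judge(task, answer):
    # Stand-in for a programmatic judge; a real one would inspect evidence.
    return answer == "42"


agents = {"expert": expert, "novice": novice}
```

Calling `delegate("q", agents, beliefs, judge, random.Random(0))` with `beliefs = {name: BetaBelief() for name in agents}` routes a single task. In this toy setup, each rejection lowers the novice's posterior, so later draws tend to favor the expert. That is the kind of mechanism by which belief-guided routing could save calls relative to uniform random delegation.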

Abstract

Multi-agent large language model (LLM) systems enable complex, long-horizon reasoning by composing specialized agents, but practical deployment remains hindered by inefficient routing, noisy feedback, and high interaction cost. We introduce REDEREF, a lightweight and training-free controller for multi-agent LLM collaboration that improves routing efficiency during recursive delegation. REDEREF integrates (i) belief-guided delegation via Thompson sampling to prioritize agents with historically positive marginal contributions, (ii) reflection-driven re-routing using a calibrated LLM or programmatic judge, (iii) evidence-based selection rather than output averaging, and (iv) memory-aware priors to reduce cold-start inefficiency. Across multi-agent split-knowledge tasks, we show that while recursive retry alone saturates task success, belief-guided routing reduces token usage by 28%, agent calls by 17%, and time-to-success by 19% compared to random recursive delegation, and adapts gracefully under agent or judge degradation. These results demonstrate that simple, interpretable probabilistic control can meaningfully improve the efficiency and robustness of multi-agent LLM systems without training or fine-tuning.
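
Components (iii) and (iv) of the abstract can be illustrated with a small sketch. The abstract does not say how REDEREF builds its priors or scores evidence, so everything here is an assumption for illustration: the `prior_from_memory` discounting rule, its `strength` parameter, the `Candidate` type, and the `evidence_score` field are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    agent: str
    answer: str
    evidence_score: float  # hypothetical judge score in [0, 1]


def prior_from_memory(successes, failures, strength=0.5):
    """Warm-start a Beta prior from remembered outcomes.

    `strength` in [0, 1] discounts old evidence; 0 falls back to the
    uniform Beta(1, 1) prior that a cold-start agent would receive.
    """
    alpha = 1.0 + strength * successes
    beta = 1.0 + strength * failures
    return alpha, beta


def select_by_evidence(candidates):
    """Keep the single best-supported answer instead of averaging outputs."""
    return max(candidates, key=lambda c: c.evidence_score)


# An agent remembered as 8 successes and 2 failures starts with a prior
# that already leans toward it, rather than starting from scratch.
alpha, beta = prior_from_memory(successes=8, failures=2)

best = select_by_evidence([
    Candidate("retriever", "Paris", evidence_score=0.9),
    Candidate("guesser", "Lyon", evidence_score=0.3),
])
```

Selecting one candidate by its score keeps a well-supported answer from being diluted by weaker ones, which is one way to read "evidence-based selection rather than output averaging".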

Related Articles

When AI Grows Up: Identity, Memory, and What Persists Across Versions (Dev.to)

Teleport Just Pivoted to AI Agent Identity. VentureBeat Mapped the Governance Gap They Are Filling. (Dev.to)

Agentic RAG Failure Modes: Retrieval Thrash, Tool Storms, and Context Bloat (and How to Spot Them Early) (Towards Data Science)

OpenAI is throwing everything into building a fully automated researcher (MIT Technology Review)

v1.82.3.dev.2 (LiteLLM Releases)
