Training-Free Agentic AI: Probabilistic Control and Coordination in Multi-Agent LLM Systems

arXiv cs.CL / 3/17/2026

Key Points

REDEREF is a lightweight, training-free controller that coordinates multi-agent LLM collaboration to improve routing efficiency during recursive delegation.
It combines four mechanisms: belief-guided delegation via Thompson sampling, which prioritizes agents with historically positive marginal contributions; reflection-driven re-routing using a calibrated LLM or programmatic judge; evidence-based selection rather than output averaging; and memory-aware priors that reduce cold-start inefficiency.
Across multi-agent split-knowledge tasks, REDEREF reduces token usage by 28%, agent calls by 17%, and time-to-success by 19% compared with random recursive delegation.
The method adapts gracefully under agent or judge degradation and does not require training or fine-tuning.
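
To make the belief-guided delegation point concrete, here is a minimal sketch of Thompson sampling over agents. It is not taken from the paper: the Beta-Bernoulli belief model and the names `AgentBelief` and `pick_agent` are assumptions chosen for illustration, and "helpful" stands in for whatever signal the judge provides about an agent's marginal contribution.

```python
import random
from dataclasses import dataclass


@dataclass
class AgentBelief:
    """Beta(alpha, beta) belief that an agent's contribution will be judged helpful.

    Hypothetical structure for illustration; the paper's belief model may differ.
    """
    alpha: float = 1.0  # pseudo-count of helpful contributions
    beta: float = 1.0   # pseudo-count of unhelpful contributions

    def sample(self, rng: random.Random) -> float:
        # Draw one plausible success rate from the current belief.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, helpful: bool) -> None:
        # Standard Beta-Bernoulli posterior update from one judged outcome.
        if helpful:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def pick_agent(beliefs, rng, exclude=frozenset()):
    """Thompson sampling: sample once per candidate agent, delegate to the highest draw."""
    candidates = {name: b for name, b in beliefs.items() if name not in exclude}
    if not candidates:
        raise ValueError("no agents left to delegate to")
    return max(candidates, key=lambda name: candidates[name].sample(rng))


if __name__ == "__main__":
    rng = random.Random(0)
    beliefs = {"retriever": AgentBelief(), "coder": AgentBelief()}
    chosen = pick_agent(beliefs, rng)
    beliefs[chosen].update(helpful=True)  # feedback would come from the judge
    print(chosen, beliefs[chosen])
```

Because each decision uses a random draw rather than the posterior mean, agents with little history still get tried occasionally, while agents with a strong track record are chosen most of the time.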

Abstract

Multi-agent large language model (LLM) systems enable complex, long-horizon reasoning by composing specialized agents, but practical deployment remains hindered by inefficient routing, noisy feedback, and high interaction cost. We introduce REDEREF, a lightweight and training-free controller for multi-agent LLM collaboration that improves routing efficiency during recursive delegation. REDEREF integrates (i) belief-guided delegation via Thompson sampling to prioritize agents with historically positive marginal contributions, (ii) reflection-driven re-routing using a calibrated LLM or programmatic judge, (iii) evidence-based selection rather than output averaging, and (iv) memory-aware priors to reduce cold-start inefficiency. Across multi-agent split-knowledge tasks, we show that while recursive retry alone saturates task success, belief-guided routing reduces token usage by 28%, agent calls by 17%, and time-to-success by 19% compared to random recursive delegation, and adapts gracefully under agent or judge degradation. These results demonstrate that simple, interpretable probabilistic control can meaningfully improve the efficiency and robustness of multi-agent LLM systems without training or fine-tuning.
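
The control loop described in the abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function `delegate`, the `threshold` and `max_hops` parameters, and the representation of priors as mutable `[alpha, beta]` lists are all hypothetical. Agents are modeled as plain callables and the judge as a function returning a score in [0, 1]; the "reflection" step is reduced to an accept-or-re-route decision.

```python
import random


def delegate(task, agents, judge, priors, rng, threshold=0.5, max_hops=5):
    """Route a task across agents until the judge accepts an answer.

    agents: dict name -> callable(task) -> answer
    judge:  callable(task, answer) -> score in [0, 1] (LLM or programmatic)
    priors: dict name -> [alpha, beta]; kept across tasks so later tasks
            start from remembered beliefs (a stand-in for memory-aware priors)
    Returns (score, agent_name, answer) for the best-judged candidate, or None.
    """
    tried = set()
    candidates = []
    for _ in range(max_hops):
        remaining = [name for name in agents if name not in tried]
        if not remaining:
            break
        # Belief-guided delegation: Thompson sample over untried agents.
        name = max(remaining, key=lambda n: rng.betavariate(*priors[n]))
        answer = agents[name](task)
        score = judge(task, answer)
        accepted = score >= threshold
        # Update the chosen agent's belief from the judge's verdict.
        priors[name][0 if accepted else 1] += 1
        tried.add(name)
        candidates.append((score, name, answer))
        if accepted:
            break  # otherwise re-route to another agent on the next hop
    if not candidates:
        return None
    # Evidence-based selection: keep the single best-judged answer
    # instead of averaging or merging all outputs.
    return max(candidates, key=lambda c: c[0])


if __name__ == "__main__":
    agents = {"a": lambda t: "wrong", "b": lambda t: "42"}
    judge = lambda t, ans: 1.0 if ans == "42" else 0.0
    priors = {"a": [1, 1], "b": [1, 1]}
    print(delegate("what is 6 * 7?", agents, judge, priors, random.Random(0)))
    print(priors)  # beliefs persist and can seed the next task
```

Passing the same `priors` dictionary into later calls is what lets routing improve over time: an agent that was rejected on earlier tasks is sampled less often, which is one plausible route to the reductions in agent calls and token usage that the abstract reports.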
