Hey r/LocalLLaMA,
I’ve been experimenting with a multi-agent loop locally to see how far smaller models can go beyond one-shot answers.
Not a novel idea — there are lots of similar setups lately. Just sharing my own results, since I’m building this solo and would like to compare notes.
Setup is roughly:
- supervisor (decides which agent runs next)
- search agent (DDG / arXiv / wiki)
- code agent (runs Python in a Docker sandbox)
- analysis agent
- skeptic agent (tries to invalidate results)
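In rough Python, the loop looks something like this (a simplified sketch — the agent names match the list above, but the routing interface and state fields are illustrative, not the repo’s actual API):

```python
# Toy supervisor loop: route through search -> code -> analysis -> skeptic,
# and let the skeptic's verdict decide whether to finish or rework.
# All names/fields here are illustrative stand-ins, not the real implementation.

def supervisor(state):
    """Pick the next agent from the current history (toy policy)."""
    if not state["history"]:
        return "search"
    last = state["history"][-1]
    if last == "search":
        return "code"
    if last == "code":
        return "analysis"
    if last == "analysis":
        return "skeptic"
    # skeptic just ran: stop if it accepted the result, else rework
    return "done" if state["skeptic_ok"] else "analysis"

def run(task, max_steps=10):
    state = {"task": task, "history": [], "skeptic_ok": False}
    for _ in range(max_steps):
        nxt = supervisor(state)
        if nxt == "done":
            break
        # ...dispatch to the actual agent here; we only record the route
        state["history"].append(nxt)
        if nxt == "skeptic":
            state["skeptic_ok"] = True  # stand-in for a real skeptic verdict
    return state["history"]
```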
What’s interesting so far:
It works noticeably better on research-style tasks that lean on code + reasoning rather than heavy web search.
But there are still some rough edges:
- the supervisor can get stuck in “doubt loops” and keep re-routing without converging
- sometimes it exits too early with a weak answer
- the skeptic can be overweighted -> unnecessary rework
- routing in general is quite sensitive to prompt wording
So overall: decent results, but not very stable yet.
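One direction I’ve been considering for the doubt-loop / early-exit problems is a pair of explicit guards on top of the router: cap how often the same agent can be revisited, and require a minimum amount of work plus a confidence score before accepting an answer. Sketch below — the thresholds and the confidence field are assumptions, not what the repo currently does:

```python
# Two guards for the failure modes above:
#   1) a visit cap per agent to break "doubt loops"
#   2) a minimum-steps + minimum-confidence check to block weak early exits
# Threshold values and the confidence signal are illustrative assumptions.
from collections import Counter

MAX_VISITS_PER_AGENT = 3   # break doubt loops after 3 visits to one agent
MIN_STEPS_BEFORE_EXIT = 3  # require some supporting work before stopping
MIN_CONFIDENCE = 0.6       # skeptic score needed to accept the answer

def should_stop(history, confidence):
    """Return (stop, reason) given the routing history and a 0-1 confidence."""
    visits = Counter(history)
    if any(v > MAX_VISITS_PER_AGENT for v in visits.values()):
        return True, "loop-guard"    # force an exit instead of spinning
    if len(history) < MIN_STEPS_BEFORE_EXIT:
        return False, "too-early"    # not enough work done yet
    if confidence >= MIN_CONFIDENCE:
        return True, "confident"
    return False, "low-confidence"   # keep working / rework
```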
Repo if anyone wants to dig into it:
https://github.com/Evidion-AI/EvidionAI
Any suggestions for improving or extending this, in terms of the pipeline or the agents themselves?
