Stabilizing multi-agent loops on local LLMs (supervisor + skeptic issues)

Reddit r/LocalLLaMA / 3/25/2026


Key Points

  • The author describes a locally run multi-agent loop composed of a supervisor, a search agent, a code agent (Python in a Docker sandbox), an analysis agent, and a skeptic/invalidation agent, built to move beyond one-shot answers.
  • Early findings suggest the setup performs better on research-style tasks that lean on code and reasoning more than on heavy web search.
  • Key instability issues include the supervisor getting stuck in “doubt loops,” sometimes exiting too early with weak answers, and the skeptic agent being overweighted and causing unnecessary rework.
  • The author notes that agent routing is highly sensitive to prompts, which affects overall reliability and consistency.
  • They share a related GitHub repo and ask the community for suggestions on improving pipelines or agent designs to stabilize local multi-agent workflows.

Hey r/LocalLLaMA,

I’ve been experimenting with a multi-agent loop locally to see how far smaller models can go beyond one-shot answers.

This isn't a big new idea, and there have been lots of similar setups lately. I'm just sharing my own results since I'm building this solo and want to compare notes.

Setup is roughly:

  • supervisor (decides which agent runs next)
  • search agent (DDG / arXiv / wiki)
  • code agent (runs Python in a Docker sandbox)
  • analysis agent
  • skeptic agent (tries to invalidate results)
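
For anyone who wants the shape of this in code, here is a minimal sketch of a supervisor-driven loop. It is not taken from the repo: `run_loop`, `scripted_supervisor`, the `"done"` sentinel, and the stub agents are all hypothetical names used for illustration, and the real agents would call a local model instead of returning fixed strings.

```python
from typing import Callable, Dict, List

# Hypothetical agent signature: reads the shared notes, returns one new note.
Agent = Callable[[List[str]], str]


def run_loop(
    supervisor: Callable[[List[str]], str],
    agents: Dict[str, Agent],
    task: str,
    max_steps: int = 10,
) -> List[str]:
    """Let the supervisor pick the next agent until it says 'done'
    (or names an unknown agent) or the step budget runs out."""
    notes = [f"task: {task}"]
    for _ in range(max_steps):
        choice = supervisor(notes)
        if choice == "done" or choice not in agents:
            break
        notes.append(f"{choice}: {agents[choice](notes)}")
    return notes


# Stub supervisor (assumption): walks a fixed order instead of asking an LLM.
def scripted_supervisor(notes: List[str]) -> str:
    order = ["search", "code", "analysis", "skeptic"]
    step = len(notes) - 1  # notes[0] is the task itself
    return order[step] if step < len(order) else "done"


# Stub agents (assumption): each just reports its own name.
agents: Dict[str, Agent] = {
    name: (lambda notes, n=name: f"{n} output")
    for name in ["search", "code", "analysis", "skeptic"]
}

if __name__ == "__main__":
    for line in run_loop(scripted_supervisor, agents, "example question"):
        print(line)
```

Keeping the supervisor as a plain function from notes to an agent name makes it easy to swap the scripted version for a model call without touching the loop.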

What’s interesting so far:

It actually works better on research-style tasks where the system relies more on code + reasoning, and less on heavy web search.

But there are still some rough edges:

  • supervisor can get stuck in “doubt loops” and keep routing between agents without ever finalizing
  • sometimes it exits too early with a weak answer
  • skeptic can be overweighted -> unnecessary rework
  • routing in general is quite sensitive to prompts
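
To make these failure modes concrete, here is a rough sketch of one way they could be bounded outside the prompt: a small guard that sits between the supervisor's proposal and the agent that actually runs. Everything in it is an assumption for illustration (the `LoopGuard` name, the thresholds, the `"done"` sentinel, and the choice of `"analysis"` as the fallback agent); it is not how the repo works, and the limits would need tuning per model.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LoopGuard:
    max_steps: int = 12           # hard budget against doubt loops
    min_steps: int = 3            # refuse 'done' before some work has happened
    max_skeptic_reworks: int = 2  # cap how often the skeptic can run
    fallback: str = "analysis"    # assumed agent to run when 'done' comes too early
    steps: int = 0
    counts: Counter = field(default_factory=Counter)

    def check(self, proposed: str) -> str:
        """Return the agent to actually run, overriding the supervisor
        when one of the limits is hit."""
        if self.steps >= self.max_steps:
            return "done"  # budget exhausted: force an exit
        if proposed == "skeptic" and self.counts["skeptic"] >= self.max_skeptic_reworks:
            proposed = "done"  # skeptic has had its say
        if proposed == "done" and self.steps < self.min_steps:
            proposed = self.fallback  # too early to exit: do more work first
        if proposed != "done":
            self.steps += 1
            self.counts[proposed] += 1
        return proposed


if __name__ == "__main__":
    guard = LoopGuard(max_steps=5, min_steps=2, max_skeptic_reworks=1)
    for proposal in ["done", "skeptic", "skeptic", "code"]:
        print(proposal, "->", guard.check(proposal))
```

Hard limits like these do not fix prompt sensitivity, but they turn an unbounded loop or a premature exit into a predictable, loggable event.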

So overall: decent results, but not very stable yet.

Repo if anyone wants to dig into it:

https://github.com/Evidion-AI/EvidionAI

Does anyone have suggestions for improving or extending this, either at the pipeline level or in the agent design?

submitted by /u/Top-Composer7331