Hey r/LocalLLaMA,
I’ve been experimenting with a multi-agent loop locally to see how far smaller models can go beyond one-shot answers.
Not a novel idea — there are lots of similar setups lately. Just sharing my own results, since I’m building this solo and would like to compare notes.
Setup is roughly:
- supervisor (decides which agent runs next)
- search agent (DDG / arXiv / wiki)
- code agent (runs Python in a Docker sandbox)
- analysis agent
- skeptic agent (tries to invalidate results)
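In rough Python, the loop looks something like this (a simplified sketch — the agent names match the list above, but the routing interface and state fields are illustrative, not the repo’s actual API):

```python
# Toy supervisor loop: route through search -> code -> analysis -> skeptic,
# and let the skeptic's verdict decide whether to finish or rework.
# All names/fields here are illustrative stand-ins, not the real implementation.

def supervisor(state):
    """Pick the next agent from the current history (toy policy)."""
    if not state["history"]:
        return "search"
    last = state["history"][-1]
    if last == "search":
        return "code"
    if last == "code":
        return "analysis"
    if last == "analysis":
        return "skeptic"
    # skeptic just ran: stop if it accepted the result, else rework
    return "done" if state["skeptic_ok"] else "analysis"

def run(task, max_steps=10):
    state = {"task": task, "history": [], "skeptic_ok": False}
    for _ in range(max_steps):
        nxt = supervisor(state)
        if nxt == "done":
            break
        # ...dispatch to the actual agent here; we only record the route
        state["history"].append(nxt)
        if nxt == "skeptic":
            state["skeptic_ok"] = True  # stand-in for a real skeptic verdict
    return state["history"]
```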
What’s interesting so far:
It works noticeably better on research-style tasks that lean on code + reasoning rather than heavy web search.
But there are still some rough edges:
- the supervisor can get stuck in “doubt loops” and keep re-routing without converging
- sometimes it exits too early with a weak answer
- the skeptic can be overweighted -> unnecessary rework
- routing in general is quite sensitive to prompt wording
So overall: decent results, but not very stable yet.
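One direction I’ve been considering for the doubt-loop / early-exit problems is a pair of explicit guards on top of the router: cap how often the same agent can be revisited, and require a minimum amount of work plus a confidence score before accepting an answer. Sketch below — the thresholds and the confidence field are assumptions, not what the repo currently does:

```python
# Two guards for the failure modes above:
#   1) a visit cap per agent to break "doubt loops"
#   2) a minimum-steps + minimum-confidence check to block weak early exits
# Threshold values and the confidence signal are illustrative assumptions.
from collections import Counter

MAX_VISITS_PER_AGENT = 3   # break doubt loops after 3 visits to one agent
MIN_STEPS_BEFORE_EXIT = 3  # require some supporting work before stopping
MIN_CONFIDENCE = 0.6       # skeptic score needed to accept the answer

def should_stop(history, confidence):
    """Return (stop, reason) given the routing history and a 0-1 confidence."""
    visits = Counter(history)
    if any(v > MAX_VISITS_PER_AGENT for v in visits.values()):
        return True, "loop-guard"    # force an exit instead of spinning
    if len(history) < MIN_STEPS_BEFORE_EXIT:
        return False, "too-early"    # not enough work done yet
    if confidence >= MIN_CONFIDENCE:
        return True, "confident"
    return False, "low-confidence"   # keep working / rework
```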
Repo if anyone wants to dig into it:
https://github.com/Evidion-AI/EvidionAI
Any suggestions for improving or extending this, in terms of the pipeline or the agents themselves?
