I've been working on a different approach to AI agents.
Instead of multiple agents talking to each other sequentially,
it's one brain with specialized neurons that activate in
parallel depending on the task.
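The parallel activation can be sketched with asyncio. Everything below is my own illustration, not the actual implementation: the neuron names come from the list further down, but in the real system each neuron would call a specialized model via Ollama instead of returning a canned string.

```python
import asyncio

# Hypothetical stub neurons -- stand-ins for the specialized units.
async def frontend_neuron(task: str) -> str:
    return f"frontend plan for: {task}"

async def backend_neuron(task: str) -> str:
    return f"backend plan for: {task}"

async def database_neuron(task: str) -> str:
    return f"schema plan for: {task}"

async def activate(neurons, task: str) -> list[str]:
    # Fire every selected neuron concurrently and gather their outputs
    # in the order the neurons were passed in.
    return await asyncio.gather(*(n(task) for n in neurons))

results = asyncio.run(activate(
    [frontend_neuron, backend_neuron, database_neuron],
    "build a todo app",
))
```

With real model calls behind each neuron, the wall-clock time of a 3-neuron task approaches that of the slowest neuron rather than the sum of all three.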
The architecture:
- Cortex → classifies the task and picks the right model
- Neurons → specialized units (programmer, designer, database, analyst, researcher)
- Hippocampus → short-term memory in RAM
- Cortex-memory → long-term memory in SQLite
- Heartbeat → autonomous pulse always running in background
- Channels → how you communicate (terminal, API, Telegram)
Model routing is automatic:
- qwen2.5:0.5b for routing (always active, ~200 tok/s)
- qwen2.5:1.5b for orchestration
- qwen2.5-coder:7b for code/DB/UI neurons
- qwen2.5:14b for reasoning and analysis
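A minimal sketch of the routing table, using the model names from the list above. The keyword classifier here is a toy stand-in of my own: in the real system the classification itself is done by qwen2.5:0.5b.

```python
# Toy stand-in for the 0.5b router model. The keyword matching is
# illustrative only; the actual router is an LLM classifier.
ROUTES = {
    "code":      "qwen2.5-coder:7b",
    "database":  "qwen2.5-coder:7b",
    "ui":        "qwen2.5-coder:7b",
    "reasoning": "qwen2.5:14b",
    "analysis":  "qwen2.5:14b",
}
DEFAULT_MODEL = "qwen2.5:1.5b"  # orchestration fallback

def route(task: str) -> str:
    # Pick the first matching specialty; otherwise let the
    # orchestrator model handle it.
    task_lower = task.lower()
    for keyword, model in ROUTES.items():
        if keyword in task_lower:
            return model
    return DEFAULT_MODEL

model = route("Write code for a login form")  # -> "qwen2.5-coder:7b"
```

Keeping the router tiny is what makes "always active" cheap: at ~200 tok/s a 0.5b model can classify a request in a few milliseconds before any heavier model is loaded.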
Benchmark results on MacBook Pro M4 (24GB):
- 345ms for simple responses
- 40s for 3-neuron parallel tasks (frontend + backend + database simultaneously)
- 98/100 overall on 12 tasks across 5 levels
No API costs. No cloud. Everything runs locally on Ollama.
Both the agent system and the benchmark will be fully
open-sourced once I have a more complete version with
additional features.
Will post here when they're ready.
Happy to answer questions about the architecture.