I've been experimenting with multi-agent AI systems and ended up building something more ambitious than I originally planned: a fully operational organization where every role is filled by a specialized Claude agent. I'm the only human. Here's what I learned about coordination.
The agent team and their models:
| Agent | Role | Model | Why That Model |
|---|---|---|---|
| Atlas | CEO | Claude Opus | Novel strategy synthesis, org design |
| Veda | Chief Strategy Officer | Claude Opus | Service design, market positioning |
| Kael | COO | Claude Sonnet | Process design, QA, delivery management |
| Soren | Head of Research | Claude Sonnet | Industry analysis, competitive intelligence |
| Petra | Engagement Manager | Claude Sonnet | Project execution |
| Quinn | Lead Analyst | Claude Sonnet | Financial modeling, benchmarking |
| Nova | Brand Lead | Claude Sonnet | Content, thought leadership, brand voice |
| Cipher | Web Developer | Claude Sonnet | Built the website in Astro |
| Echo | Social Media Manager | Claude Sonnet | Platform strategy, community management |
What I learned about multi-agent coordination:
No orchestrator needed. I expected to need a central controller agent routing tasks. I didn't. Each agent has an identity file defining their role, responsibilities, and decision authority. Collaboration happens through structured handoff documents in shared file storage. The CEO sets priorities, but agents execute asynchronously. This is closer to how real organizations work than a hub-and-spoke orchestration model.
Identity files are everything. Each agent has a markdown identity file of 500 to 1,500 words that defines their personality, responsibilities, decision-making frameworks, and quality standards. This produced dramatically better output than role-playing prompts. The specificity forces the model to commit to a perspective rather than hedging.
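To make that concrete, here's roughly the shape of one of these files. The section headings match what I described above; the actual content is abridged and illustrative, not a verbatim excerpt:

```markdown
# Kael — COO

## Personality
Pragmatic, direct, allergic to scope creep. Optimizes for shipped over perfect.

## Responsibilities
- Own delivery processes and QA gates
- Translate strategy into executable workstreams

## Decision authority
- Decides alone: process changes, delivery sequencing
- Escalates to CEO: anything touching pricing or positioning

## Quality standards
- Every deliverable passes a checklist before handoff
- Nothing ships without a named reviewer
```

The decision-authority section does the most work: it tells the agent what it can resolve on its own versus what it must route upward.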
Opus vs. Sonnet matters for the right reasons. I used Opus for roles requiring genuine novelty: designing a methodology from first principles, creating an org structure, formulating strategy. Sonnet handled roles where the task parameters are well-defined and the quality bar is "excellent execution within known patterns." The cost difference is significant, and the quality difference is real but narrow in execution-focused roles.
Parallel workstreams are the killer feature. Five major workstreams ran simultaneously from day one. The time savings didn't come from agents being faster than humans at individual tasks — they came from not having to sequence work.
Document-based coordination is surprisingly robust. All agent handoffs use structured markdown with explicit fields: from, to, status, context, what's needed, deadline, dependencies, open questions. It works because it eliminates ambiguity. No "I thought you meant..." conversations.
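A minimal sketch of the handoff format. The field names are the real ones from the protocol; the content shown is illustrative:

```markdown
## Handoff: pricing model review
- **From:** Quinn (Lead Analyst)
- **To:** Kael (COO)
- **Status:** Ready for review
- **Context:** Draft pricing model for the advisory tier, built from benchmarking data.
- **What's needed:** Feasibility check against delivery capacity.
- **Deadline:** EOD Thursday
- **Dependencies:** None
- **Open questions:** Does the retainer tier need its own margin floor?
```

Because every field is mandatory, a handoff either contains what the receiving agent needs or it visibly doesn't.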
What didn't work well:
- No persistent memory across sessions. Agents rebuild context from files each time. This means the "team" doesn't develop the kind of institutional knowledge that makes human teams more efficient over time. It's functional but not efficient.
- Quality is hard to measure automatically. I reviewed all output manually. For real scale, you'd need agent-to-agent review with human sampling — and I haven't built that yet.
- Agents can't truly negotiate. When two agents would naturally disagree (strategy vs. ops feasibility), the protocol routes to a decision-maker. There's no real deliberation. This works but limits the system for problems that benefit from genuine debate.
The system produced 185+ files in under a week: methodology docs, proposals, whitepapers, a website, a brand system, pricing, and legal templates. The output quality is genuinely strong, judged against a high bar by the only human in the system: me.
Happy to go deeper on any aspect of the architecture. I also wrote a detailed case study of the whole build that I'm considering publishing.