You're already using an AI coding agent. Claude Code, Codex, Aider — pick one. It works. You give it a task, it writes code, you review it. Simple.
But you have a backlog. Five tasks that could run in parallel. You open three terminal tabs, paste prompts, and immediately discover why "just run more agents" doesn't scale.
Here's the progressive path from one agent to a supervised team. Each step adds capability without requiring you to rethink everything at once.
## Stage 0: Where You Are Now
One agent. One terminal. Sequential tasks.
```shell
claude-code
# "Add JWT authentication"
# Wait 15 minutes
# Review, iterate
# "Now write the API tests"
# Wait 10 minutes
# Review, iterate
```
This works. The limitation is time — you're processing tasks sequentially when many of them could run in parallel.
When to move past this: You regularly have 3+ independent tasks queued, and you're spending more time waiting for agents than reviewing their output.
## Stage 1: Two Agents in Parallel
The simplest upgrade. Two terminal tabs, two agents, two tasks.
```shell
# Terminal 1
mkdir -p .agents
git worktree add .agents/agent-1 -b agent-1/task-auth
cd .agents/agent-1
claude-code
# "Add JWT authentication"
```

```shell
# Terminal 2
git worktree add .agents/agent-2 -b agent-2/task-tests
cd .agents/agent-2
codex
# "Write integration tests for the user API"
```
Key change: Git worktrees. Each agent gets its own directory on its own branch. Without this, they overwrite each other's files.
What you manage manually: Checking when each agent finishes. Running tests. Merging branches one at a time. Resolving conflicts.
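The merge-and-cleanup part of that follows a fixed pattern, so it is worth capturing once. A minimal sketch as a helper function, run from the repository root; `retire_agent` is an illustrative name, and the paths and branch names are the Stage 1 examples:

```shell
# retire_agent <worktree-dir> <branch>: merge an agent's finished
# work, then remove its worktree and branch.
retire_agent() {
  git merge "$2"            # bring the finished work in
  git worktree remove "$1"  # delete the agent's checkout
  git branch -d "$2"        # safe delete: refuses if unmerged
}

# Usage, with the Stage 1 layout:
# retire_agent .agents/agent-1 agent-1/task-auth
```

The `-d` (not `-D`) flag is the safety net: if you try to retire a branch that never got merged, git refuses instead of silently dropping work.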
When to move past this: You're running 2 agents reliably, but the manual merge/test/check cycle is eating your supervision time. You find yourself asking "is agent 2 still working or stuck?"
## Stage 2: Add tmux and Basic Monitoring
Replace terminal tabs with tmux. Now you can see both agents simultaneously and detach without killing them.
```shell
# Create a session with two panes
tmux new-session -s agents -d
tmux split-window -h -t agents
# Launch agents in their worktrees
tmux send-keys -t agents:0.0 'cd .agents/agent-1 && claude-code' Enter
tmux send-keys -t agents:0.1 'cd .agents/agent-2 && codex' Enter
# Attach and watch both
tmux attach -t agents
```
Key change: Visibility and persistence. You can see both agents working side by side. Close your laptop, SSH back in later, tmux attach — they're still running.
What you still manage manually: Everything from Stage 1, plus you're now building up tmux muscle memory for pane navigation.
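The "is it still working or stuck?" question can also be scripted once the agents live in tmux. A rough sketch using `tmux capture-pane` and output hashing; the `pane_idle` helper name is mine, and it assumes GNU `sha256sum` is available:

```shell
# pane_idle <target> <seconds>: hash a tmux pane's visible output
# twice, <seconds> apart; identical hashes suggest the agent is idle.
pane_idle() {
  target=$1; secs=$2
  before=$(tmux capture-pane -p -t "$target" | sha256sum)
  sleep "$secs"
  after=$(tmux capture-pane -p -t "$target" | sha256sum)
  [ "$before" = "$after" ]   # exit 0 = idle, non-zero = still working
}

# Usage, with the Stage 2 pane layout:
# pane_idle agents:0.0 60 && echo "agent-1 looks idle"
```

If the visible output has not changed across the window, the agent is probably waiting on you rather than working.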
When to move past this: You want 3+ agents, automated test checking, or the ability to walk away and come back to merged results.
## Stage 3: Add Test Gating
Before this stage, "done" means the agent said it's done. After this stage, "done" means tests pass.
```shell
# After agent-1 says it's finished:
cd .agents/agent-1
cargo test   # or npm test, pytest
echo $?      # 0 = merge, non-zero = send back

# If tests pass:
cd /project
git merge agent-1/task-auth

# If tests fail:
# Copy the failure output back to the agent
# "Tests failed: thread 'test_jwt_auth' panicked..."
```
Key change: Quality gate. This single check reduces "agent broke something" incidents by roughly 80%. The remaining 20% trace to gaps in test coverage, not agent failures.
What you still manage manually: Running the test command, reading the output, deciding whether to merge or send feedback.
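If you find yourself repeating that cycle, it can be wrapped in a small helper. A sketch under the Stage 1 layout; `gate_and_merge` and its arguments are illustrative, not from any tool:

```shell
# gate_and_merge <worktree-dir> <test-cmd> <branch>: run the tests
# inside the agent's worktree; merge only if they pass, otherwise
# print the failure output to send back to the agent.
gate_and_merge() {
  dir=$1; test_cmd=$2; branch=$3
  log=$(mktemp)
  if (cd "$dir" && sh -c "$test_cmd") >"$log" 2>&1; then
    git merge "$branch"           # gate passed: bring the work in
  else
    echo "tests failed - feedback for the agent:"
    tail -n 20 "$log"             # paste this into the agent's session
    return 1
  fi
}

# Usage, with the Stage 3 example values:
# gate_and_merge .agents/agent-1 'cargo test' agent-1/task-auth
```

Run it from the repository root; the exit code tells you whether the merge happened.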
When to move past this: You're doing the test-run-merge cycle manually for every task and want it automated. You want an architect that decomposes features before engineers execute.
## Stage 4: Add an Architect
Separate planning from execution. One agent decomposes work; others execute.
Until now, you've been the architect — deciding what each agent works on. An architect agent takes a high-level objective and breaks it into specific, testable tasks.
```
You       → "Build user authentication with JWT"
Architect → Creates tasks:
  1. Add JWT middleware to protected routes
  2. Implement login endpoint with token generation
  3. Implement token refresh endpoint
  4. Write integration tests for auth flow
```
Each task is independent, specific, and has clear completion criteria. The difference in output quality between "build auth" and four decomposed tasks is dramatic.
Key change: Task decomposition quality. A good architect prompt produces better results than adding more engineers with vague tasks.
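What makes the architect effective is its role prompt. A minimal sketch of one; the wording is illustrative, not taken from any particular tool:

```
You are the architect. Given a feature objective, decompose it into
independent tasks that can run in parallel. For each task, specify:
- Scope: one component or endpoint, completable in a single session
- Files likely to be touched
- Completion criteria: the test command and what must pass
Do not write code. Output only the task list.
```

The key constraints are independence (tasks must not block each other) and testable completion criteria, which is what makes Stage 3's gate meaningful.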
## Stage 5: Automate with Batty
Every step above is something you can do manually. Batty automates the full loop:
```yaml
# .batty/team_config/team.yaml
roles:
  - name: architect
    role_type: architect
    agent: claude
    instances: 1
    talks_to: [manager]
  - name: manager
    role_type: manager
    agent: claude
    instances: 1
    talks_to: [architect, engineer]
  - name: engineer
    role_type: engineer
    agent: codex
    instances: 3
    use_worktrees: true
    talks_to: [manager]
```

```shell
cargo install batty-cli
batty init --template standard
batty start --attach
batty send architect "Build user authentication with JWT"
```
What Batty automates:
- Worktree creation: persistent per-engineer, fresh branch per task
- Test gating: runs your test command before allowing merges
- Merge serialization: file lock prevents concurrent merge conflicts
- Task dispatch: Markdown kanban board with auto-assignment
- Message routing: Maildir inboxes with `talks_to` constraints
- Idle detection: 4-layer system (output hashing, session files, context exhaustion, completion packets)
- Agent lifecycle: spawn, monitor, restart on crash or context exhaustion
You supervise the team instead of operating the machinery.
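One item on that list is worth understanding even if you script the earlier stages yourself: merge serialization. When two engineers finish at once, concurrent merges step on each other, so every merge should go through a lock. The idea in a few lines of shell; this is a sketch of the concept, not Batty's implementation, and the lock path is arbitrary:

```shell
# merge_locked <branch>: take an exclusive file lock before merging,
# so two finishing agents can never merge at the same time.
merge_locked() {
  (
    flock 9                   # block until this process owns the lock
    git merge "$1"
  ) 9>/tmp/repo-merge.lock    # fd 9 backed by a shared lock file
}
```

Any number of agents can call this concurrently; `flock` queues them and exactly one merge runs at a time.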
## Which Stage Are You?
| Stage | You're ready when... | Time to set up |
|---|---|---|
| 0 → 1 | You have 2+ independent tasks | 5 minutes |
| 1 → 2 | You want to see agents side by side | 10 minutes |
| 2 → 3 | You want quality gates before merging | 5 minutes |
| 3 → 4 | You want better task decomposition | 15 minutes |
| 4 → 5 | You want the full loop automated | `cargo install batty-cli` |
Start where you are. Each stage adds capability without requiring you to rebuild your workflow from scratch. Most developers find Stage 1 (worktrees) delivers immediate value — you can stay there for weeks before needing more.
The important thing isn't the tool. It's the progression: isolate work, gate on tests, decompose tasks, then automate.