🦀 PicoClaw Deep Dive — A Field Guide to Building an Ultra-Light AI Agent in Go 🐹

Dev.to / 4/28/2026

💬 Opinion · Developer Stack & Infrastructure · Tools & Practical Usage · Models & Research

Key Points

  • PicoClaw is a single-binary, Go-based personal AI agent designed to run on $10-class hardware and use under 10MB of RAM, enabling fast boot and broad hardware compatibility.
  • The guide explains PicoClaw’s architecture and core concepts, including an agent loop/pipeline, mid-loop steering via injected messages, hierarchical sub-agents (SubTurn), and session handling with JSONL persistence.
  • It details practical engineering patterns such as rule-based model routing, a hook system for extensibility, and abstractions for chat channels (18+ platforms) and LLM providers (30+ models) behind unified interfaces.
  • The article covers building an agent with tools/skills and MCP integration, plus resource-efficiency techniques (notably the “<10MB secret”) and deployment strategies like cross-compilation and single-binary releases.
  • Overall, it functions as an actionable field guide for recreating a PicoClaw-style ultra-light agent from scratch in Go, alongside common pitfalls and recommended reading through the project’s source.

A comprehensive, actionable guide to the principles, techniques, and architecture behind sipeed/picoclaw — written so you can build a similar system from scratch.

Table of Contents

  1. 🧩 What PicoClaw Is and Why It Matters
  2. 🎯 Design Philosophy
  3. 🏗️ High-Level Architecture
  4. 🔄 Core Concept #1 — The Agent Loop & Pipeline
  5. 🕹️ Core Concept #2 — Steering (Mid-Loop Message Injection)
  6. 🤝 Core Concept #3 — SubTurn (Hierarchical Sub-Agents)
  7. 💾 Core Concept #4 — Sessions & JSONL Persistence
  8. 🧭 Core Concept #5 — Rule-Based Model Routing
  9. 🪝 Core Concept #6 — The Hook System
  10. 📡 Core Concept #7 — Channel Abstraction (18+ chat platforms)
  11. 🤖 Core Concept #8 — Provider Abstraction (30+ LLMs)
  12. 🛠️ Core Concept #9 — Tools, Skills, and MCP
  13. ⚡ Resource-Efficiency Techniques (the <10MB secret)
  14. 📦 Cross-Compilation & Single-Binary Deployment
  15. ⚙️ Reference Configuration Schema
  16. 🗺️ Step-by-Step: Build Your Own PicoClaw-Style Agent
  17. ⚠️ Common Pitfalls & Lessons Learned
  18. 📖 Recommended Reading Path Through the PicoClaw Source

1. 🧩 What PicoClaw Is and Why It Matters

PicoClaw is a single-binary, Go-based personal AI agent that runs in under 10 MB of RAM on $10-class hardware (RISC-V SBCs, Raspberry Pi Zero, MIPS routers, Android via Termux, even old NanoKVM boards). It is heavily inspired by NanoBot, but rewritten in Go and "self-bootstrapped": roughly 95% of the code was generated by an agent under human review.

What makes it remarkable is not that it talks to LLMs — that's easy — but that it does so while being:

| Property | PicoClaw | Typical Python AI stack |
|---|---|---|
| Memory footprint | < 10 MB | 200 MB – 2 GB |
| Boot time | < 1 s on 0.6 GHz CPU | 5–30 s |
| Distribution | One static binary | venv + dozens of wheels |
| Architectures | x86_64, ARM, ARM64, RISC-V, MIPS, LoongArch | mostly x86_64/ARM64 |
| Channels | 18+ (Telegram, Discord, WeChat, Slack…) | 1–2 typically |
| LLM providers | 30+ via unified interface | 1–3, SDK-locked |

The product is not "a chatbot." It is a portable agent runtime with first-class support for tools, MCP, sub-agents, multi-channel messaging, and provider routing.

2. 🎯 Design Philosophy

These are the principles that drive every design decision. Internalize these first; the code will then make sense.

2.1 🪶 Lean by default, extensible by interface

Choose Go because it produces small, statically-linked binaries with tiny runtime overhead, no GIL, and predictable memory. Wrap every variable subsystem (LLM, channel, tool, hook, registry) behind an interface so a feature can be added without touching the core loop.

2.2 📦 One binary, every architecture

A user deploying to a $10 RISC-V board should not have to think about Docker, Python versions, or shared libraries. make build-all produces binaries for Linux/amd64, ARM, ARM64, RISC-V, MIPS LE, LoongArch, Darwin ARM64, Windows, and NetBSD from one tree.

2.3 💾 Append-first persistence (JSONL)

Sessions and memories are stored as JSON Lines files with a sidecar .meta.json. Append-only is crash-safe, debug-friendly (tail -f), and trivially shippable. Schema migration happens lazily on read.

2.4 🗂️ Promote routing data to first-class fields

Channels do not bury chatId, senderId, and messageId inside generic metadata maps. Those are typed fields on InboundMessage. Routing, sessions, and hooks all rely on this contract.

2.5 🔍 Capabilities are discovered, not hardcoded

Each channel optionally implements MediaSender, TypingCapable, ReactionCapable, MessageEditor, WebhookHandler, HealthChecker. The manager probes via type assertions. Adding a new platform never touches the manager.

2.6 💰 Cheap-first, escalate when necessary

A rule-based classifier scores each turn 0..1 (token count, code blocks, recent tool calls, attachments, depth). Below threshold the request goes to a cheap "light" model. Above it, the heavy model. This alone cuts API spend dramatically for chatty workloads.

2.7 👁️ Observe everything, intercept rarely

Five synchronous hook points (before_llm, after_llm, before_tool, after_tool, approve_tool) are enough. Everything else is read-only event observation through an EventBus. Hooks can be in-process Go code or external processes via JSON-RPC over stdio.

2.8 🕹️ The user can change their mind mid-run

Users issue corrections. The agent loop polls a per-session steering queue after every tool call. New messages are injected before the next LLM turn; remaining queued tools are skipped with a "Skipped due to queued user message" result so the model knows what didn't run.

3. 🏗️ High-Level Architecture

                       ┌────────────────────────────────────────────┐
 18+ Chat Channels ─►  │  pkg/channels  (per-platform sub-packages) │
   (Telegram,          │  ─ BaseChannel, capability interfaces       │
    Discord, …)        │  ─ Manager: rate-limit, split, retry        │
                       └──────────────────────┬─────────────────────┘
                                              │  InboundMessage
                                              ▼
                       ┌────────────────────────────────────────────┐
                       │  pkg/bus  (typed event bus, in/out ctx)    │
                       └──────────────────────┬─────────────────────┘
                                              ▼
                       ┌────────────────────────────────────────────┐
                       │  pkg/routing                                │
                       │  ─ Dispatch: which agent handles this?      │
                       │  ─ Classifier: complexity score 0..1        │
                       │  ─ Light/Heavy model decision               │
                       └──────────────────────┬─────────────────────┘
                                              ▼
                       ┌────────────────────────────────────────────┐
                       │  pkg/session                                │
                       │  ─ SessionScope (agent/channel/account/dim) │
                       │  ─ JSONL backend + .meta sidecar            │
                       │  ─ Canonical key sk_v1_<sha256> + aliases   │
                       └──────────────────────┬─────────────────────┘
                                              ▼
                       ┌────────────────────────────────────────────┐
                       │  pkg/agent  (the loop)                      │
                       │                                             │
                       │  pipeline_setup → pipeline_llm →            │
                       │  pipeline_execute (tools) → pipeline_finalize│
                       │                                             │
                       │  ┌──────────┐  ┌──────────┐  ┌──────────┐  │
                       │  │ steering │  │ subturn  │  │  hooks   │  │
                       │  └──────────┘  └──────────┘  └──────────┘  │
                       │                                             │
                       │       ▲                          ▲          │
                       │       │ tools                    │ MCP       │
                       └───────┼──────────────────────────┼──────────┘
                               │                          │
                       ┌───────┴────────┐         ┌───────┴────────┐
                       │  pkg/tools     │         │  pkg/mcp       │
                       │  fs / shell /  │         │  isolated      │
                       │  hardware /    │         │  command       │
                       │  search ...    │         │  transport     │
                       └────────────────┘         └────────────────┘

                       ┌────────────────────────────────────────────┐
                       │  pkg/providers (factory + facades)          │
                       │  anthropic / openai_compat / azure /        │
                       │  bedrock / oauth / cli ...                  │
                       │  cooldown · ratelimiter · fallback ·        │
                       │  error_classifier                           │
                       └────────────────────────────────────────────┘

Three top-level binaries are produced from cmd/:

  • picoclaw — the agent itself (CLI + headless server)
  • picoclaw-launcher-tui — terminal UI launcher
  • membench — internal memory benchmark used to keep the <10MB promise honest

4. 🔄 Core Concept #1 — The Agent Loop & Pipeline

The pkg/agent package is where everything converges. The loop is split into four pipeline stages, each in its own file:

| File | Stage | Job |
|---|---|---|
| pipeline_setup.go | Setup | Build prompt, load session history, resolve model, mount hooks |
| pipeline_llm.go | LLM call | Call provider, stream tokens, parse tool calls and thinking blocks |
| pipeline_execute.go | Tool execution | Run tool calls (possibly in parallel), enforce approvals, record results |
| pipeline_finalize.go | Finalize | Persist session, emit events, send outbound message, close turn |

Around the pipeline are cross-cutting modules:

  • turn_coord.go — owns the per-turn state machine, decides light vs. heavy model, chooses provider candidates.
  • turn_state.go / turn_context.go — typed turn-scoped state.
  • context_manager.go / context_budget.go / context_usage.go — keep the message window inside the model's token limit; trim oldest, summarize, or drop based on budget.
  • prompt.go / prompt_contributors.go / prompt_turn.go — composable prompt builders. Each contributor adds a slice (system identity, tool list, memory, time, channel context).
  • eventbus.go / events.go — fan-out of every meaningful event (tool_exec_start, llm_request, turn_finished, …) to observers.
  • registry.go — agent registry; definition.go describes one agent (name, system prompt, tool set, models, light candidates).

Actionable patterns to copy

  1. Make the loop a strict state machine, not a callback web. Each pipeline file exports a single function that takes and returns a turn state. Easier to test, to add tracing, and to inject hooks.
  2. Have the agent definition be plain data. A Definition struct (pkg/agent/definition.go) is a name + system prompt + tool allow-list + provider candidates + light candidates. Loading from YAML/JSON becomes trivial.
  3. Separate "what to send to the LLM" from "how to send it." Prompt contributors build the abstract message list; the provider facade (next section) maps it to vendor-specific JSON.
  4. Track usage at the turn level. context_usage.go keeps token-in/token-out per turn so you can enforce per-turn budget caps and emit metering events without parsing logs.
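
To make the strict-state-machine shape concrete, here is a minimal sketch of four stages threaded through a single turn state. All names are hypothetical, not PicoClaw's actual types:

```go
package main

import "fmt"

// TurnState is a hypothetical turn-scoped state passed through each stage.
type TurnState struct {
	Messages []string
	Model    string
	Done     bool
}

// Each pipeline stage takes and returns the turn state — no callback web.
type Stage func(TurnState) (TurnState, error)

func setup(s TurnState) (TurnState, error)     { s.Model = "heavy"; return s, nil }
func callLLM(s TurnState) (TurnState, error)   { s.Messages = append(s.Messages, "llm reply"); return s, nil }
func execTools(s TurnState) (TurnState, error) { return s, nil }
func finalize(s TurnState) (TurnState, error)  { s.Done = true; return s, nil }

// runTurn walks the stages in order; tracing or hooks can wrap each step.
func runTurn(s TurnState) (TurnState, error) {
	for _, stage := range []Stage{setup, callLLM, execTools, finalize} {
		var err error
		if s, err = stage(s); err != nil {
			return s, err
		}
	}
	return s, nil
}

func main() {
	s, _ := runTurn(TurnState{})
	fmt.Println(s.Done, len(s.Messages)) // prints "true 1"
}
```

Because every stage has the same signature, testing one stage in isolation is just calling a function with a fabricated state.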

5. 🕹️ Core Concept #2 — Steering (Mid-Loop Message Injection)

"The user can correct the agent at any moment. Make that a first-class concern."

pkg/agent/steering.go (and agent_steering.go) implements a per-session FIFO queue that the loop polls at four checkpoints:

  1. Loop initialization (before first LLM call)
  2. After each tool completes
  3. After each non-tool LLM response
  4. Before turn finalization

If a queued message exists at any of those points:

  • Any remaining tool calls in the current LLM response are skipped, each receiving the synthetic result "Skipped due to queued user message." so the model still understands what did/didn't run.
  • The queued message is appended to the conversation as a new user turn.
  • The loop re-enters the LLM stage.

Why this matters

  • Side-effect safety. A user yelling "don't send that email" actually stops the email, because the still-queued send is skipped before it executes.
  • Compute savings. Skipping a planned batch of three 3–4 s tool calls avoids roughly 10 s of wasted work.
  • Model awareness. The skip is announced via a tool-result message, so the model can adapt instead of repeating the same plan.

Modes & limits

```go
agentLoop.SetSteeringMode(agent.SteeringOneAtATime) // default: pop one per check
agentLoop.SetSteeringMode(agent.SteeringAll)        // drain whole queue at once
```

Hard cap: MaxQueueSize = 10 messages per session. Overflow returns an error on manual Steer() and a warning when an inbound channel-bus drain triggers it.

Public API to copy

```go
// External: inject a correction
err := agentLoop.Steer(providers.Message{
    Role:    "user",
    Content: "actually, focus on X instead",
})

// External: nudge an idle session to continue
resp, err := agentLoop.Continue(ctx, sessionKey, channel, chatID)
```

Implementation notes

  • The queue is scoped by canonical session key. Different chats never bleed into each other.
  • Media references (media://...) survive steering — they're resolved in the normal pipeline before the provider call.
  • Inbound messages for a session that already has an active turn are automatically enqueued as steering rather than starting a competing turn.
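
The queue itself can be sketched as a mutex-guarded FIFO with a hard cap. The field names and error text below are illustrative, not taken from steering.go:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

const MaxQueueSize = 10 // hard cap per session, as in the article

// SteeringQueue is a hypothetical per-session FIFO. Overflow is an error,
// turning a potential memory bug into a rejected request.
type SteeringQueue struct {
	mu   sync.Mutex
	msgs []string
}

func (q *SteeringQueue) Steer(msg string) error {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.msgs) >= MaxQueueSize {
		return errors.New("steering queue full")
	}
	q.msgs = append(q.msgs, msg)
	return nil
}

// Pop is called at each loop checkpoint; ok is false when nothing is queued.
func (q *SteeringQueue) Pop() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.msgs) == 0 {
		return "", false
	}
	msg := q.msgs[0]
	q.msgs = q.msgs[1:]
	return msg, true
}

func main() {
	q := &SteeringQueue{}
	q.Steer("focus on X instead")
	if msg, ok := q.Pop(); ok {
		fmt.Println(msg) // prints "focus on X instead"
	}
}
```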

6. 🤝 Core Concept #3 — SubTurn (Hierarchical Sub-Agents)

Sub-agents are isolated nested loops spawned by a parent turn. Defined in pkg/agent/subturn.go.

Properties

| Property | Value |
|---|---|
| Max nesting depth | 3 |
| Max concurrent per parent | 5 (semaphore-guarded, 30 s timeout) |
| Default timeout | 5 min (parent and child have independent timeouts) |
| Message buffer | 50 messages per sub-turn (does not contaminate parent history) |
| Result delivery | async via pendingResults channel (16-message buffer) |
| Cancellation | hard abort cascades to children & grandchildren |
| Critical: true | survives parent completion and continues in background |

When the parent polls results

Same checkpoints as steering — before every LLM call, after every tool call, before finalize. This keeps result handling deterministic without polling threads.

Why context derives from context.Background(), not the parent's ctx

So that an independent timeout on a child does not surprise it when the parent finishes early. If you want cascading cancellation for a particular sub-turn, the parent calls cancel() explicitly.

Pattern to copy

```go
// inside parent agent loop
result, err := agent.SpawnSubTurn(ctx, agent.SubTurnSpec{
    AgentDef:   "researcher",
    Goal:       "Find primary sources for claim X",
    Critical:   false,
    Timeout:    2 * time.Minute,
    MaxHistory: 50,
})
```

Pitfalls

  • Orphan results. If the parent finishes before the child, the result is dropped (with a telemetry event). Either mark the child Critical: true or await it explicitly.
  • Buffer overflow. With 5 concurrent subs and a 16-slot result buffer, bursty completions can overflow — design subs to emit a single final result, not progress updates.

7. 💾 Core Concept #4 — Sessions & JSONL Persistence

pkg/session answers two questions: which messages share a conversation? and how is that conversation stored durably?

7.1 🪪 SessionScope — the structured identity of a conversation

```go
type SessionScope struct {
    Version    string            // ScopeVersionV1
    AgentID    string            // routed agent
    Channel    string            // normalized channel name ("telegram")
    Account    string            // bot/account identifier
    Dimensions []string          // active partition dims, e.g. ["chat"]
    Values     map[string]string // concrete dim values
}
```

Default dimension set is ["chat"] → "one shared conversation per chat unless a dispatch rule overrides it." A dispatch rule can promote topic or sender into the dimension set to split or merge conversations.

7.2 🔑 Two key formats

| Format | Example | Purpose |
|---|---|---|
| Canonical | sk_v1_<sha256> | Stable, opaque, the source of truth |
| Legacy | agent:main:direct:user123 | Backward compat, resolved transparently |

The JSONL backend resolves legacy aliases to canonical keys during reads and writes — so you can rename schemes without losing history.
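
A sketch of how such a content-addressed key could be derived. The exact signature layout here is an assumption; only the sk_v1_<sha256> shape comes from the source:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// CanonicalKey hashes a sorted, versioned scope signature, so the key is
// deterministic regardless of map iteration order and stable across renames
// of the human-readable scheme.
func CanonicalKey(agent, channel, account string, values map[string]string) string {
	parts := []string{"v1", agent, channel, account}
	dims := make([]string, 0, len(values))
	for k := range values {
		dims = append(dims, k)
	}
	sort.Strings(dims) // order-independent signature
	for _, d := range dims {
		parts = append(parts, d+"="+values[d])
	}
	sum := sha256.Sum256([]byte(strings.Join(parts, "|")))
	return "sk_v1_" + hex.EncodeToString(sum[:])
}

func main() {
	key := CanonicalKey("main", "telegram", "bot1", map[string]string{"chat": "42"})
	fmt.Println(key[:6]) // prints "sk_v1_"
}
```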

7.3 📄 JSONL on disk

Per session:

  • <key>.jsonl — one providers.Message per line, append-only.
  • <key>.meta.json — { summary, created_at, updated_at, line_count, skip_offset, scope, aliases }.

Why two files: messages are append-only and crash-safe; metadata is overwritten under a per-shard mutex but small enough that a torn write is recoverable from the JSONL.

"Designed around append-first durability and stale-over-loss recovery."

7.4 📐 Allocator rules

The allocator turns inbound metadata into scope values:

  • space → <space_type>:<space_id>
  • chat → <chat_type>:<chat_id>
  • topic → topic:<topic_id>
  • sender → canonicalized through identity-link mappings (so that a user's Telegram ID and Slack ID map to the same logical sender)

Special case: Telegram forum topics append /<topic_id> to chat values when topic is not an explicit dimension — preventing topic cross-talk by default.

7.5 ⚡ Concurrency

A 64-shard mutex array (hash key → shard) serializes per-session writes without keeping an unbounded mutex map. This is a small but important pattern: lock striping is essentially free and fixes 99% of session-store contention bugs.
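
Lock striping in Go is only a few lines; this sketch hashes the session key into one of 64 fixed mutexes:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// shardedLocks serializes per-session writes with a fixed 64-slot mutex
// array instead of an unbounded per-key mutex map.
type shardedLocks struct {
	shards [64]sync.Mutex
}

// lockFor maps a session key to a stable shard: the same key always hits
// the same mutex, while unrelated keys rarely collide.
func (s *shardedLocks) lockFor(key string) *sync.Mutex {
	h := fnv.New32a()
	h.Write([]byte(key))
	return &s.shards[h.Sum32()%64]
}

func main() {
	var locks shardedLocks
	var wg sync.WaitGroup
	counter := 0
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu := locks.lockFor("sk_v1_abc") // same session → same shard
			mu.Lock()
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(counter) // prints 100
}
```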

7.6 🔀 Migration

On startup the system attempts to migrate legacy JSON sessions into JSONL. If migration fails, it falls back to the legacy SessionManager rather than crash-looping the agent.

Actionable patterns

  • Make session keys content-addressed (sha256 over a canonical scope signature) so renaming dimensions doesn't break history.
  • Sidecar metadata is far simpler than embedding a header line in the JSONL.
  • Lock striping > one big mutex > one mutex per session. 64 shards is a good default.

8. 🧭 Core Concept #5 — Rule-Based Model Routing

pkg/routing is a two-stage pipeline:

  1. Agent dispatch — the Router picks which agent definition handles the message (rules over channel, sender, content, command prefix, etc.).
  2. Model routing — once an agent is chosen, the RuleClassifier decides whether to use the agent's primary (heavy) model or a globally-configured cheap light model.

8.1 ⚙️ Configuration

```json
{
  "routing": {
    "enabled": true,
    "light_model": "gemini-2.0-flash",
    "threshold": 0.35
  }
}
```

8.2 🔬 Features extracted per turn

The classifier is intentionally language-agnostic (no keyword lists), using five structural features:

| Feature | What it measures |
|---|---|
| TokenEstimate | Approximate token count (CJK-aware rune counting) |
| CodeBlockCount | Number of fenced code blocks in the latest message |
| RecentToolCalls | Tool invocations in the last 6 history entries |
| ConversationDepth | Total history length |
| HasAttachments | Media references or recognized file extensions |

8.3 ⚖️ Weighted scoring (clamped to [0,1])

| Signal | Weight |
|---|---|
| Has attachments | 1.00 |
| Code block present | 0.40 |
| Tokens > 200 | 0.35 |
| Recent tool calls > 3 | 0.25 |
| Tokens > 50 | 0.15 |
| Recent tool calls 1–3 | 0.10 |
| Conversation depth > 10 | 0.10 |

With threshold 0.35, trivial chat stays cheap; code, attachments, or active tool use trigger heavy. Long plain prompts cross at the 200-token boundary.
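
The weight table above translates directly into code. This sketch assumes the token tiers and tool-call tiers are mutually exclusive (the source does not say whether they stack):

```go
package main

import "fmt"

// Features extracted per turn — structural and language-agnostic.
type Features struct {
	TokenEstimate     int
	CodeBlockCount    int
	RecentToolCalls   int
	ConversationDepth int
	HasAttachments    bool
}

// Score reproduces the weighted sum from the table, clamped to [0,1].
func Score(f Features) float64 {
	s := 0.0
	if f.HasAttachments {
		s += 1.00
	}
	if f.CodeBlockCount > 0 {
		s += 0.40
	}
	if f.TokenEstimate > 200 {
		s += 0.35
	} else if f.TokenEstimate > 50 {
		s += 0.15
	}
	if f.RecentToolCalls > 3 {
		s += 0.25
	} else if f.RecentToolCalls >= 1 {
		s += 0.10
	}
	if f.ConversationDepth > 10 {
		s += 0.10
	}
	if s > 1 {
		s = 1 // clamp
	}
	return s
}

func main() {
	trivial := Score(Features{TokenEstimate: 20})
	coding := Score(Features{TokenEstimate: 300, CodeBlockCount: 1})
	fmt.Println(trivial < 0.35, coding >= 0.35) // prints "true true"
}
```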

8.4 🔌 Where it plugs in

pkg/agent/turn_coord.go swaps the candidate provider list to agent.LightCandidates when score < threshold; otherwise it uses the agent's primary candidate set unchanged. The agent doesn't know — it just receives a different ordered list of providers.

Pattern to copy

  • Routing rules are data, not code. Keep them in JSON. Hot-reload is then os.Stat + json.Unmarshal.
  • Each agent has both Candidates and LightCandidates — primary and cheap fallback chains. Routing only picks the chain; the fallback logic inside the chain is generic (next section).

9. 🪝 Core Concept #6 — The Hook System

Five synchronous hook points + arbitrary read-only observers. Defined in pkg/agent/hooks.go, hook_mount.go, hook_process.go.

9.1 🔗 The five synchronous points

| Stage | Allowed actions |
|---|---|
| before_llm | continue · modify (rewrite request) · abort_turn · hard_abort |
| after_llm | continue · modify (rewrite response) |
| before_tool | continue · modify (rewrite args) · respond (skip exec, supply result) · deny_tool |
| after_tool | continue · modify (rewrite tool result) |
| approve_tool | allow / deny only |

Everything else is observer-only events on the bus.

9.2 🔄 In-process vs out-of-process

In-process: Go function registered at startup. Zero serialization cost. Used for built-ins like rate-limit injectors, audit loggers, schema validators.

Out-of-process: any program speaking JSON-RPC over stdio. Spawned and supervised by HookManager. Use for Python ML reranking, secret scrubbers, external policy engines, even mocking tools during tests.

9.3 📡 JSON-RPC framing

```json
// Request from host → hook
{ "jsonrpc": "2.0", "id": 7, "method": "hook.before_tool", "params": { ... } }

// Hook → host
{ "jsonrpc": "2.0", "id": 7, "result": { "action": "respond", "result": "cached" } }

// Notification (one-way; observer events)
{ "jsonrpc": "2.0", "method": "hook.event", "params": { "Kind": "tool_exec_start" } }
```

Lifecycle: host calls hook.hello first to negotiate protocol version + capabilities.

9.4 ⚙️ Configuration shape

```json
{
  "hooks": {
    "enabled": true,
    "observer_timeout_ms": 200,
    "interceptor_timeout_ms": 5000,
    "approval_timeout_ms": 30000,
    "builtins": {
      "audit_log": { "enabled": true, "priority": 10, "config": {} }
    },
    "processes": {
      "policy_check": {
        "enabled": true,
        "priority": 100,
        "transport": "stdio",
        "command": ["python3", "/srv/policy.py"],
        "env": { "POLICY_FILE": "/etc/policy.yml" },
        "observe": ["tool_exec_start"],
        "intercept": ["before_tool", "approve_tool"]
      }
    }
  }
}
```

9.5 📋 Hook ordering

In-process first → then by priority ascending → then by name. Deterministic and easy to reason about.

What hooks are NOT for

  • Sending messages to channels directly (use the bus).
  • Suspending a turn pending human approval (model that as an external state machine).
  • Full message interception across all platforms (that is a channel-level concern).

Patterns to copy

  • Make the hook protocol versioned (hook.hello). It saves a major refactor 18 months later.
  • Observers run with a strict timeout (e.g. 200ms). Slow observers degrade quietly into "skipped" instead of stalling turns.
  • respond action lets a hook fake tool output. Cache, mock, override — without touching the registry.

10. 📡 Core Concept #7 — Channel Abstraction (18+ chat platforms)

pkg/channels is the textbook example of capability-based polymorphism in Go.

10.1 📜 The contract

Every platform sub-package embeds BaseChannel (base.go) and implements the minimum interface. Each platform self-registers a factory in init():

```go
func init() {
    channels.Register("telegram", New)
}
```

registry.go is the single source of truth; the manager never imports specific platforms.

10.2 🔌 Capability interfaces (optional)

```go
type MediaSender interface{ SendMedia(...) error }
type TypingCapable interface{ ShowTyping(...) error }
type ReactionCapable interface{ React(...) error }
type PlaceholderCapable interface{ SendPlaceholder(...) (id string, err error) }
type MessageEditor interface{ Edit(...) error }
type WebhookHandler interface{ HandleWebhook(http.ResponseWriter, *http.Request) }
type HealthChecker interface{ Check(ctx context.Context) error }
```

The manager probes channels with if c, ok := ch.(MediaSender); ok { ... }. Adding VoiceCapable to one platform doesn't change anyone else.
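
A minimal illustration of the probing pattern, with simplified signatures rather than the real interfaces:

```go
package main

import "fmt"

// Minimal channel contract; capabilities are separate optional interfaces.
type Channel interface{ Send(chat, text string) error }

type MediaSender interface{ SendMedia(chat, url string) error }

type telegram struct{}

func (telegram) Send(chat, text string) error     { return nil }
func (telegram) SendMedia(chat, url string) error { return nil }

type ircLike struct{}

func (ircLike) Send(chat, text string) error { return nil }

// The manager probes with a type assertion; platforms never declare a flag.
func deliver(ch Channel, chat, text, mediaURL string) {
	if mediaURL != "" {
		if ms, ok := ch.(MediaSender); ok {
			ms.SendMedia(chat, mediaURL)
			return
		}
		text += " " + mediaURL // graceful fallback: send the link as text
	}
	ch.Send(chat, text)
}

func main() {
	_, tgHasMedia := Channel(telegram{}).(MediaSender)
	_, ircHasMedia := Channel(ircLike{}).(MediaSender)
	fmt.Println(tgHasMedia, ircHasMedia) // prints "true false"
}
```

Adding a new capability is a new interface plus one probe site in the manager; no existing platform changes.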

10.3 🗂️ First-class fields, not metadata bags

InboundMessage (in pkg/bus) hoists routing data to typed fields:

```go
type InboundMessage struct {
    Peer       Peer       // platform + chat + topic
    MessageID  string
    Sender     SenderInfo // canonical identity ("telegram:42")
    Body       string
    Media      []MediaRef
    ReceivedAt time.Time
}
```

This is the contract that pkg/session.Allocator and pkg/routing.Router rely on. Put it in your design from day one — retrofitting is painful.

10.4 🎛️ Centralized orchestration in the manager

The manager (not the platform) owns:

  • Worker queue with rate limit per channel.
  • Outbound message splitting (split.go) — long replies are broken at sentence/word boundaries below the platform's per-message limit.
  • Retries with backoff on transient errors classified by errors.go / errutil.go.
  • Typing/reaction indicators as transparent decorations of long turns.

Platforms only know how to send a single chunk. Everything fancy happens above them.

10.5 🪪 Identity normalization

pkg/identity defines the canonical "platform:id" format and identity-link tables that collapse multi-platform users into one logical sender. This is what enables cross-channel memory and consistent routing.

Patterns to copy

  • Self-registration via blank-import side effects: the main binary just does _ "yourapp/channels/telegram" and the channel becomes available. No registry plumbing.
  • Capability interfaces beat optional methods on a god-interface. You will thank yourself when the 12th platform needs something weird.
  • Sentinel errors in errors.go so the manager can decide retry vs. drop without parsing strings.

11. 🤖 Core Concept #8 — Provider Abstraction (30+ LLMs)

pkg/providers is built around a factory + facade pattern.

11.1 📁 Layout

```plaintext
pkg/providers/
  factory.go            // registers and instantiates providers by name
  factory_provider.go
  cli_facade.go         // unified facade for "CLI"-shaped providers
  httpapi_facade.go     // unified facade for HTTP-shaped providers
  oauth_facade.go       // unified facade for OAuth flows
  cooldown.go           // per-provider cool-down on auth/quota errors
  ratelimiter.go        // token-bucket per provider
  fallback.go           // chain-of-responsibility fallback to next candidate
  error_classifier.go   // network/auth/rate/server/unknown
  types.go              // Message, ContentBlock, ToolCall, Usage, …

  anthropic/            // Anthropic Messages API
  anthropic_messages/   // alt path, e.g. server-side tools
  openai_compat/        // OpenAI + every API-compatible vendor
  openai_responses_common/
  azure/                // Azure OpenAI specifics
  bedrock/              // AWS Bedrock
  httpapi/              // generic HTTP fallback
  oauth/                // device flows
  cli/                  // local CLI providers (Ollama-style)
  common/               // shared message-utility helpers
  messageutil/
  protocoltypes/
```

11.2 🔌 The provider interface (conceptual)

A provider exposes:

  • Send(ctx, request) (response, error) (streaming via channel)
  • Capabilities() (tools? vision? thinking? context window? streaming?)
  • Name(), Model()

The agent loop never imports a specific provider — it picks one from a candidate list returned by the routing layer.

11.3 🛡️ Reliability stack (the part most projects miss)

When a provider call fails, the wrapper consults:

  1. error_classifier — Auth? Rate-limit? Network blip? 5xx?
  2. cooldown — if Auth/Quota, mark this provider unavailable for N minutes.
  3. ratelimiter — token bucket to keep us under contractual TPM/RPM.
  4. fallback — try next candidate in the chain (heavy → light, or primary → secondary key).

The agent never sees this — it sees one logical "send" that either returns a response or gives up after the chain is exhausted.

Patterns to copy

  • Provider config is protocol/model strings, e.g. "openai/gpt-5.4", "anthropic/claude-opus-4-7". Swap by editing config; no recompile.
  • Keep API keys in a separate .security.yml out of config.json. Different file permissions, easier to scrub in bug reports.
  • The classifier's job is to decide retry-or-not. Don't bake retry into each provider — it'll diverge.

12. 🛠️ Core Concept #9 — Tools, Skills, and MCP

Three layers of "things the agent can do beyond LLM calls":

12.1 🔧 Tools — built-in, in-process

pkg/tools/:

  • fs/ — read, write, list, glob.
  • shell.go (+ Unix/Windows variants) — process exec.
  • hardware/ — device interactions (USB, GPIO, camera; appropriate for SBCs).
  • integration/ — outbound HTTP, web search (DuckDuckGo, Brave, Tavily, Baidu).
  • shared/ — shared helpers used by multiple categories.
  • registry.go — registers tools; exposes Get(name), List(), schema.
  • toolloop.go — orchestrates tool execution within a single turn (parallel-safe, with approval hook integration).
  • search_tool.go — first-class tool selector for "find a tool that does X."
  • spawn.go / spawn_status.go — long-running child process management.

12.2 📚 Skills — installable plugins

pkg/skills/:

  • Two registry backends: clawhub_registry.go (custom hub), github_registry.go (any repo with the right manifest).
  • installer.go — fetch, verify, materialize on disk.
  • loader.go — load at runtime.
  • provider_factory.go — skills can ship with provider configurations.
  • search_cache.go — registry search results are cached.
  • config_bridge.go — skill config is merged into runtime config without leaking into the parent file.

A skill is essentially a packaged bundle of (tools | hooks | provider configs | prompts | docs) that can be installed by name and removed cleanly.

12.3 🔗 MCP — Model Context Protocol

pkg/mcp/:

  • manager.go — owns connections to MCP servers, exposes their tools/resources/prompts to the agent.
  • isolated_command_transport.go — spawns each MCP server in an isolated process and talks JSON-RPC over stdio. Prevents one buggy server from crashing the agent.
  • manager_test.go — coverage.

agent_mcp.go (in pkg/agent) wires MCP-discovered tools into the per-turn tool list. From the model's perspective, an MCP tool and a built-in tool are indistinguishable.

Patterns to copy

  • Built-in tools stay tiny and audited. Anything ambitious (browser automation, payments) lives behind MCP or skills.
  • MCP transport isolation is non-negotiable. Treat MCP servers as untrusted child processes.
  • Tools have schema, descriptions, and approval flags as data, not Go conditionals. Re-using the tool registry for skills and MCP just becomes a matter of listing them.

13. ⚡ Resource-Efficiency Techniques (the <10MB secret)

Hitting <10 MB on a 0.6 GHz RISC-V is engineering, not magic. The techniques used:

13.1 🐹 Choice of Go

  • Static linking: no shared-library footprint.
  • No JIT/interpreter. No Python startup cost.
  • -ldflags="-s -w" strips the symbol table and DWARF info from the binary (~30% size reduction).
  • -trimpath removes file system paths.
  • UPX (optional) for additional compression on flash-poor boards.

13.2 🧵 Minimal goroutine surface

A typical concurrent system spawns thousands of goroutines. PicoClaw keeps it tight: one per active channel listener, one per active turn, one per running sub-turn (capped at 5×N), one per spawned hook process, one per MCP transport. Goroutines are cheap but each carries a stack — keep them counted.

13.3 🚧 Bounded queues everywhere

  • Steering queue: 10
  • SubTurn result buffer: 16
  • Concurrent SubTurns per parent: 5
  • Channel manager worker queue: per-platform configured

Bounded queues turn "memory bug" into "rejected request" — you can monitor and tune.

13.4 🌊 Streaming, not buffering

LLM responses are streamed token-by-token. Tool outputs from spawned processes are streamed line-by-line. Big responses never sit fully in memory.

13.5 📄 JSONL append-only persistence

Constant-memory writes; reads are line-iterators. No O(n) JSON object reload on every turn.

13.6 😴 Lazy initialization

Channels, hooks, and skill registries initialize only when enabled in config. Disabled subsystems contribute zero allocations.

13.7 📊 membench as a regression gate

cmd/membench is shipped in the repo: a synthetic workload that measures peak RSS. If a PR busts the budget, CI catches it.

13.8 🔧 Architecture-aware patches

For MIPS LE on Ingenic X2600 / NaN2008 kernels, the Makefile patches the ELF e_flags at offset 36 after building. Without this, the kernel rejects the binary. Lesson: cross-compilation is not done when the linker exits.

14. 📦 Cross-Compilation & Single-Binary Deployment

14.1 🔨 The build matrix (make build-all)

| OS | GOARCH | Notes |
|---|---|---|
| linux | amd64 | |
| linux | arm (GOARM=7) | Pi Zero 2 W (32-bit) |
| linux | arm64 | Pi Zero 2 W (64-bit), most modern SBCs |
| linux | riscv64 | LicheeRV-Nano, MaixCAM |
| linux | mipsle | post-build ELF flag patch for NaN2008 kernels |
| linux | loong64 | LoongArch |
| darwin | arm64 | Apple Silicon |
| windows | amd64 | |
| netbsd | amd64 / arm64 | |
Specialized targets:

  • build-pi-zero → 32-bit + 64-bit Pi Zero 2 W bundle.
  • build-android-bundle → universal APK with JNI libs (the agent runs as a native service inside the APK).
  • build-whatsapp-native → adds the native WhatsApp bridge.
  • build-launcher / build-launcher-tui → web/TUI control panels.

14.2 🏷️ Version stamping

```shell
go build -ldflags "-s -w \
  -X main.version=$(VERSION) \
  -X main.commit=$(COMMIT) \
  -X main.date=$(DATE)"
```

picoclaw --version then prints the stamped values — vital for triage.
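On the Go side, the `-X` flags only work if matching package-level string variables exist. A sketch of the receiving end (variable names must match the `-X main.…` paths exactly):

```go
package main

import (
	"flag"
	"fmt"
)

// These defaults are overwritten at build time via
// -ldflags "-X main.version=... -X main.commit=... -X main.date=...".
var (
	version = "dev"
	commit  = "none"
	date    = "unknown"
)

func versionString() string {
	return fmt.Sprintf("picoclaw %s (commit %s, built %s)", version, commit, date)
}

func main() {
	showVersion := flag.Bool("version", false, "print version and exit")
	flag.Parse()
	if *showVersion {
		fmt.Println(versionString())
		return
	}
	// ... normal startup ...
}
```

An unstamped dev build still prints something useful ("picoclaw dev …") instead of an empty string.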

14.3 🚀 Single-binary delivery

The launcher (web or TUI) is a tiny supervisor that:

  1. Detects platform, picks the right binary.
  2. Drops it into ~/.picoclaw/.
  3. Spawns it and proxies a local browser to http://localhost:18800 for configuration.

End user double-clicks the launcher; agent runs. No package manager, no Docker, no Python.
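Step 1 of the launcher reduces to `runtime.GOOS`/`runtime.GOARCH`. A sketch of the binary-selection logic (the naming scheme mirrors the build matrix above; the helper itself is illustrative):

```go
package main

import (
	"fmt"
	"runtime"
)

// binaryName picks the right bundled binary for the host platform,
// following the <name>-<os>-<arch> convention from the build matrix.
func binaryName(goos, goarch string) string {
	name := fmt.Sprintf("picoclaw-%s-%s", goos, goarch)
	if goos == "windows" {
		name += ".exe"
	}
	return name
}

func main() {
	fmt.Println(binaryName(runtime.GOOS, runtime.GOARCH))
}
```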

15. ⚙️ Reference Configuration Schema

Annotated subset of config.example.json:

```jsonc
{
  // Default agent settings used when an agent doesn't override.
  "defaults": {
    "workspace": "~/.picoclaw/workspace",
    "model_name": "openai/gpt-5.4",
    "max_iterations": 25,
    "max_input_tokens": 128000,
    "max_output_tokens": 4096
  },

  // Provider candidates. API keys live in .security.yml, NOT here.
  "models": [
    { "name": "openai/gpt-5.4", "endpoint": "https://api.openai.com/v1" },
    { "name": "anthropic/claude-opus-4-7", "endpoint": "https://api.anthropic.com" },
    { "name": "google/gemini-2.0-flash" },
    { "name": "ollama/qwen3", "endpoint": "http://localhost:11434" }
  ],

  // Cheap-first routing.
  "routing": {
    "enabled": true,
    "light_model": "google/gemini-2.0-flash",
    "threshold": 0.35
  },

  // Per-channel config; most disabled by default.
  "channels": {
    "telegram": { "enabled": false, "token": "" },
    "discord": { "enabled": false, "token": "" },
    "slack": { "enabled": false, "bot_token": "", "app_token": "" },
    "matrix": { "enabled": false },
    "wechat": { "enabled": false }
  },

  // Tool surface.
  "tools": {
    "web_search": { "enabled": true, "providers": ["duckduckgo", "brave", "tavily"] },
    "shell": { "enabled": true, "approval_required": true },
    "fs": { "enabled": true, "root": "~/.picoclaw/workspace" },
    "cron": { "enabled": true }
  },

  // External MCP servers, each isolated in its own process.
  "mcp": {
    "servers": {
      "filesystem": { "command": ["mcp-server-fs"], "enabled": true }
    }
  },

  // Skills marketplace.
  "skills": {
    "registries": {
      "clawhub": { "enabled": true, "url": "https://hub.picoclaw.io" },
      "github": { "enabled": true }
    },
    "installed": []
  },

  // Hooks: in-process built-ins + external processes.
  "hooks": {
    "enabled": true,
    "observer_timeout_ms": 200,
    "interceptor_timeout_ms": 5000,
    "approval_timeout_ms": 30000,
    "builtins": {
      "audit_log": { "enabled": true, "priority": 10 }
    },
    "processes": {}
  },

  // Heartbeat for liveness reporting and autoscale signals.
  "heartbeat": { "interval_seconds": 30 },

  // Web UI gateway.
  "gateway": { "host": "127.0.0.1", "port": 18800 }
}
```

Companion file:

```yaml
# .security.yml -- separate file, separate permissions
openai:
  api_key: sk-...
anthropic:
  api_key: sk-ant-...
telegram:
  token: 1234:ABC...
```

16. 🗺️ Step-by-Step: Build Your Own PicoClaw-Style Agent

A pragmatic 12-step roadmap. Each step yields a runnable artifact.

Step 1 — 🦴 Skeleton repo

```plaintext
yourapp/
  cmd/yourapp/main.go          # entry
  pkg/
    agent/
    bus/
    channels/
    config/
    providers/
    routing/
    session/
    tools/
  Makefile
  config/config.example.json
  .security.example.yml
```

main.go reads config, constructs a Manager, blocks on os.Signal. Nothing else yet.

Step 2 — 🚌 Typed message bus

Define InboundMessage and OutboundMessage with first-class Peer, Sender, MessageID. Build pkg/bus/bus.go as a fan-out dispatcher with bounded per-subscriber queues.
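A sketch of what that looks like, under the assumption of flat string identifiers (PicoClaw's actual field set may differ):

```go
package main

import "fmt"

// InboundMessage carries identity as first-class fields, never as
// ad-hoc string concatenation. Field names here are illustrative.
type InboundMessage struct {
	Channel   string // "telegram", "stdio", ...
	Peer      string // chat / room identifier
	Sender    string // user identifier
	MessageID string
	Text      string
}

// Bus fans each inbound message out to every subscriber, each behind
// its own bounded queue so one slow consumer can't back up the rest.
type Bus struct {
	subs []chan InboundMessage
}

func (b *Bus) Subscribe(buf int) <-chan InboundMessage {
	ch := make(chan InboundMessage, buf)
	b.subs = append(b.subs, ch)
	return ch
}

func (b *Bus) Publish(m InboundMessage) {
	for _, ch := range b.subs {
		select {
		case ch <- m:
		default: // subscriber queue full: drop rather than block
		}
	}
}

func main() {
	var bus Bus
	sub := bus.Subscribe(8)
	bus.Publish(InboundMessage{Channel: "stdio", Peer: "local", Text: "hello"})
	fmt.Println((<-sub).Text) // hello
}
```

OutboundMessage is symmetric; what matters is that identity travels as typed fields from the first line of code.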

Step 3 — 📺 One channel: stdin/stdout

Implement a stdio channel that reads lines from stdin, emits InboundMessage, prints OutboundMessage. This is your dev harness — no Telegram tokens needed.

Step 4 — 🤖 One provider: OpenAI-compatible

Build the openai_compat provider. Make it streaming. Define a Provider interface with Send(ctx, req) (<-chan Chunk, error).

Step 5 — 🔄 Minimal agent loop

pkg/agent/pipeline_*.go. Setup → LLM → execute (no tools yet) → finalize. Hardcode a system prompt. End-to-end you should now type "hello" and get a streamed reply.

Step 6 — 💾 Sessions on JSONL

Build pkg/session with canonical keys, JSONL backend, .meta.json sidecar, 64-shard mutex. Now conversation persists across runs.
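The key derivation and shard selection are a few lines each. A sketch (field set and separator choice are illustrative; the NUL separator is one way to honor the "never concatenate strings" rule, since raw concatenation lets distinct scopes collide):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// SessionScope is structured, never a concatenated display string.
type SessionScope struct {
	Channel string
	Peer    string
	Agent   string
}

// Key derives the canonical sk_v1_<sha256> session key. NUL
// separators keep ("ab","c") and ("a","bc") from colliding.
func (s SessionScope) Key() string {
	h := sha256.Sum256([]byte(s.Channel + "\x00" + s.Peer + "\x00" + s.Agent))
	return "sk_v1_" + hex.EncodeToString(h[:])
}

// shard picks one of 64 mutexes from a hash of the key, spreading
// lock contention across the session store.
func shard(key string) int {
	h := sha256.Sum256([]byte(key))
	return int(h[0]) % 64
}

func main() {
	k := SessionScope{Channel: "telegram", Peer: "12345", Agent: "default"}.Key()
	fmt.Println(len(k), shard(k) >= 0 && shard(k) < 64) // 70 true
}
```

The locking side is then a fixed `[64]sync.Mutex` array indexed by `shard(key)`.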

Step 7 — 🛠️ Tools registry

Implement pkg/tools/registry.go with Get, List, Schema(). Add two tools: fs.read and web.fetch. Wire pipeline_execute to call them on parsed tool calls.

Step 8 — 🕹️ Steering

Add per-session FIFO queue + four polling points. Test by sending a follow-up while the agent is running tools — it must skip remaining tools with the explicit "Skipped" tool result.
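The check-and-skip logic at a polling point can be sketched like this (a non-blocking drain plus explicit skip results; names are mine, not PicoClaw's):

```go
package main

import "fmt"

// drainSteering checks the per-session FIFO without blocking; a
// mid-turn user message means the rest of the tool plan is stale.
func drainSteering(q chan string) (string, bool) {
	select {
	case msg := <-q:
		return msg, true
	default:
		return "", false
	}
}

func main() {
	q := make(chan string, 10)
	tools := []string{"web.fetch", "fs.read", "shell"}
	q <- "actually, stop: wrong file" // user steers mid-turn

	var results []string
	for i, tool := range tools {
		if msg, ok := drainSteering(q); ok {
			// Skip what's left; every skipped tool still gets an
			// explicit result so the transcript stays consistent.
			for _, skipped := range tools[i:] {
				results = append(results, skipped+": Skipped (user steering: "+msg+")")
			}
			break
		}
		results = append(results, tool+": ok")
	}
	fmt.Println(len(results)) // 3
}
```

The invariant to test: every planned tool call ends up with exactly one result, executed or skipped, never silently dropped.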

Step 9 — 🪝 Hooks

Define five hook points + observer events. Build in-process registration first; add JSON-RPC stdio process hooks once the in-process path is solid.

Step 10 — 🧭 Routing

Add pkg/routing classifier with the five features and weighted scoring. Add light_model to config. Verify cheap chat goes to the light model.
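A sketch of the shape of that classifier. The feature names and weights below are illustrative placeholders, not PicoClaw's actual feature set; only the structure (boolean features, weighted sum, threshold against `light_model`) follows the design:

```go
package main

import "fmt"

// features are cheap structural signals extracted from the request;
// this particular set is hypothetical.
type features struct {
	HasToolHints bool
	HasCode      bool
	LongInput    bool
	MultiStep    bool
	PriorToolUse bool
}

// score folds the features into [0,1] with hand-tuned weights.
func score(f features) float64 {
	s := 0.0
	if f.HasToolHints {
		s += 0.3
	}
	if f.HasCode {
		s += 0.2
	}
	if f.LongInput {
		s += 0.2
	}
	if f.MultiStep {
		s += 0.2
	}
	if f.PriorToolUse {
		s += 0.1
	}
	return s
}

// pickModel routes below-threshold turns to the configured light model.
func pickModel(f features, threshold float64) string {
	if score(f) < threshold {
		return "google/gemini-2.0-flash" // light_model from config
	}
	return "openai/gpt-5.4"
}

func main() {
	fmt.Println(pickModel(features{}, 0.35))
	fmt.Println(pickModel(features{HasToolHints: true, MultiStep: true}, 0.35))
}
```

Because both features and weights are data, this is trivially hot-reloadable from the JSON config.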

Step 11 — 📡 Second channel + capability interfaces

Add Telegram. Define MediaSender, TypingCapable, WebhookHandler capability interfaces. Move retries / splitting / rate-limit into manager.go. The Telegram channel itself should be ~200 lines.
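The capability-interface pattern is plain Go type assertion. A sketch (interface and type names are illustrative):

```go
package main

import "fmt"

// Channel is the mandatory surface every platform implements.
type Channel interface {
	Name() string
	SendText(peer, text string) error
}

// TypingCapable is an optional capability; the manager probes for it
// instead of forcing every channel to stub it out.
type TypingCapable interface {
	SendTyping(peer string) error
}

// stdioChannel implements Channel but not TypingCapable.
type stdioChannel struct{}

func (stdioChannel) Name() string                  { return "stdio" }
func (stdioChannel) SendText(_, text string) error { fmt.Println(text); return nil }

// maybeSendTyping degrades gracefully on channels without the capability.
func maybeSendTyping(c Channel, peer string) bool {
	if t, ok := c.(TypingCapable); ok {
		t.SendTyping(peer)
		return true
	}
	return false // no typing indicator on this platform; that's fine
}

func main() {
	fmt.Println(maybeSendTyping(stdioChannel{}, "local")) // false
}
```

MediaSender and WebhookHandler follow the same probe-and-degrade shape, which is why each new channel stays small.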

Step 12 — 📦 Cross-compile & ship

```makefile
build-all:
	GOOS=linux GOARCH=amd64 go build -ldflags="-s -w" -trimpath -o dist/yourapp-linux-amd64 ./cmd/yourapp
	GOOS=linux GOARCH=arm GOARM=7 go build -ldflags="-s -w" -trimpath -o dist/yourapp-linux-armv7 ./cmd/yourapp
	GOOS=linux GOARCH=arm64 go build -ldflags="-s -w" -trimpath -o dist/yourapp-linux-arm64 ./cmd/yourapp
	GOOS=linux GOARCH=riscv64 go build -ldflags="-s -w" -trimpath -o dist/yourapp-linux-riscv64 ./cmd/yourapp
	GOOS=linux GOARCH=mipsle GOMIPS=softfloat go build -ldflags="-s -w" -trimpath -o dist/yourapp-linux-mipsle ./cmd/yourapp
	GOOS=darwin GOARCH=arm64 go build -ldflags="-s -w" -trimpath -o dist/yourapp-darwin-arm64 ./cmd/yourapp
```

Run du -h dist/* — single-digit MB binaries. Confirm with a membench run that peak RSS stays under your target (e.g. 10 MB).

Then add: SubTurns (Step 13), MCP (14), skills marketplace (15), web launcher (16), more channels (17–N).

17. ⚠️ Common Pitfalls & Lessons Learned

These are the traps either explicit in PicoClaw's docs or implied by its design choices.

| Pitfall | Mitigation |
|---|---|
| Goroutine leaks via unbounded fan-out | Bounded queues + errgroup per scope (turn, session, channel). |
| Cross-channel memory crosstalk | Canonical session key from sha256(scope) — never concatenate strings. |
| Forum/topic chats merging into one conversation | Append `/<topic_id>` to chat values when topic isn't an explicit dimension. |
| Tool side effects after a user correction | Skip remaining tools on steering arrival; emit explicit skip results. |
| Orphan SubTurn results crashing parent | 16-slot result buffer + `Critical: true` for must-finish work. |
| `context.Background()` vs parent ctx confusion | Document explicitly in your SubTurn API; default to independent timeouts. |
| API keys in plaintext config | Two files: config.json + .security.yml with stricter perms. |
| Memory regressions slipping in | Ship membench and gate it in CI. |
| MIPS LE binaries refused by kernel | Patch ELF e_flags at offset 36 after build. |
| Hooks blocking turns | Per-class timeouts: observer 200ms, interceptor 5s, approval 30s. |
| Rebuilding when adding a provider | Provider config is protocol/model strings; factory dispatches at runtime. |
| Schema drift between sessions | Lazy migration in JSONL backend; never edit applied "migrations" — append new ones. |
| Routing rules buried in code | Routing is data — JSON rules + features. Hot-reload friendly. |
| 30 channels each duplicating retry logic | Centralize retry/split/rate-limit in manager.go; channels send a single chunk. |
| MCP server bug killing the agent | Spawn each MCP server in an isolated process via isolated_command_transport. |
| One mutex around the session store | 64-shard mutex array on hash(key). |

18. 📖 Recommended Reading Path Through the PicoClaw Source

If you read these files in this order, the architecture clicks fast:

  1. cmd/picoclaw/main.go — the boot sequence.
  2. pkg/bus/types.go — the typed message contract that flows through the whole system.
  3. pkg/agent/definition.go — what an agent is as data.
  4. pkg/agent/pipeline.go → pipeline_setup.go → pipeline_llm.go → pipeline_execute.go → pipeline_finalize.go — the loop.
  5. pkg/agent/turn_coord.go — the brains tying routing, providers, and steering together.
  6. pkg/agent/steering.go — the most copy-worthy single concept in the project.
  7. pkg/agent/subturn.go — sub-agent semantics.
  8. pkg/session/manager.go + jsonl_backend.go + allocator.go — durable state.
  9. pkg/routing/router.go + classifier.go + features.go — cheap-first routing.
  10. pkg/agent/hooks.go + hook_mount.go + hook_process.go — extensibility.
  11. pkg/channels/manager.go + base.go + interfaces.go — channel abstraction.
  12. pkg/providers/factory.go + cooldown.go + fallback.go + error_classifier.go — provider reliability stack.
  13. pkg/tools/registry.go + toolloop.go — tool execution.
  14. pkg/mcp/manager.go + isolated_command_transport.go — MCP integration.
  15. pkg/skills/registry.go + installer.go — plugin marketplace.
  16. Makefile — cross-compilation matrix, ELF patching, version stamping.
  17. docs/architecture/*.md — official narrative for steering, subturn, sessions, routing, hooks.

🎯 TL;DR — The Recipe in One Page

  1. Use Go. Static binaries, small RSS, uniform across architectures.
  2. Typed message bus with first-class Peer, Sender, MessageID.
  3. Pipelined agent loop: setup → LLM → tools → finalize, with a turn state struct.
  4. Steering: per-session FIFO queue polled at 4 checkpoints; skipped tools get explicit results.
  5. SubTurns with depth ≤ 3, concurrency ≤ 5, independent timeouts, Critical flag for must-finish.
  6. Sessions: structured SessionScope → canonical sk_v1_<sha256> key, JSONL + .meta.json, 64-shard locking.
  7. Routing: classifier with 5 structural features, weighted score, light_model below threshold.
  8. Hooks: 5 sync points + observer events, in-process or JSON-RPC over stdio, per-class timeouts.
  9. Channels: each in its own sub-package, embed BaseChannel, declare optional capabilities by interface, manager owns retries/splitting/rate-limit.
  10. Providers: factory + facades + cooldown + ratelimiter + fallback + error_classifier, configured by protocol/model strings, secrets in .security.yml.
  11. Tools / MCP / Skills: in-process tools for built-ins; MCP for untrusted external tools (isolated transport); skills as installable bundles from a registry.
  12. Bounded queues, streaming, lazy init, -ldflags="-s -w", -trimpath, membench regression gate.
  13. Cross-compile to amd64/arm/arm64/riscv64/mipsle + Darwin + Windows + NetBSD; patch MIPS ELF e_flags; ship a launcher that auto-picks the binary.

Build steps 1–12 from §16 in order, validate with the patterns in §17, and you have a PicoClaw-class agent.

If you found this helpful, let me know with a 👍 or a comment. And if you think this post could help someone, feel free to share it! Thank you very much! 😃