I cut Claude Code's token usage by 68.5% by giving agents their own OS

Reddit r/artificial / 3/29/2026

💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage

Key Points

  • The article claims a 68.5% reduction in Claude Code token usage by introducing an “agentic JSON-native OS” that avoids wasteful human-oriented state checks and repeated context re-discovery on cold starts.
  • It reports scenario benchmarks showing large token savings for operations like semantic search vs grep/cat (91%), agent pickup vs cold log parsing (83%), and state polling vs multiple shell commands (57%).
  • The work is presented as fully reproducible via a provided benchmark script (python3 tools/bench_compare.py) and includes results across five real scenarios.
  • The OS is designed to integrate with Claude Code through MCP and to run local inference via Ollama, with the project licensed as MIT.
  • The author invites feedback from practitioners running agentic workflows to validate and iterate on the approach.

AI agents are running on infrastructure built for humans. Every state check runs 9 shell commands.

Every cold start re-discovers context from scratch.

It's wasteful by design.

An agentic JSON-native OS fixes it. Benchmarks across 5 real scenarios:

Semantic search vs grep + cat: 91% fewer tokens

Agent pickup vs cold log parsing: 83% fewer tokens

State polling vs shell commands: 57% fewer tokens

Overall: 68.5% reduction
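The state-polling comparison above can be sketched in a few lines. This is an illustrative approximation, not the project's actual benchmark: the command outputs, JSON schema, and the ~4-characters-per-token heuristic are all assumptions, chosen only to show why a compact machine-oriented state document costs fewer tokens than parsing several human-oriented command outputs.

```python
# Hypothetical sketch: an agent needing current project state can either
# ingest the verbose output of several shell commands, or read one compact
# JSON state document. Token counts are approximated as characters // 4.
import json

# Example human-oriented output from commands like `git status`, `ps`, `df -h`
# (illustrative text, not real benchmark data).
shell_outputs = [
    "On branch main\nYour branch is up to date with 'origin/main'.\n"
    "nothing to commit, working tree clean\n",
    "  PID TTY          TIME CMD\n 1234 pts/0    00:00:01 python3\n",
    "Filesystem      Size  Used Avail Use% Mounted on\n"
    "/dev/sda1       100G   42G   58G  42% /\n",
]

# The same facts expressed once, as a machine-oriented JSON state document
# (hypothetical schema).
state = {"branch": "main", "dirty": False, "procs": ["python3"], "disk_used_pct": 42}

def approx_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token."""
    return len(text) // 4

shell_tokens = sum(approx_tokens(s) for s in shell_outputs)
json_tokens = approx_tokens(json.dumps(state, separators=(",", ":")))
savings = 1 - json_tokens / shell_tokens
print(f"shell: {shell_tokens} tokens, json: {json_tokens} tokens, savings: {savings:.0%}")
```

Even with toy data, the compact JSON reading comes out well under the shell-output total, which is the mechanism behind the 57% state-polling figure claimed in the post.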

Benchmark is fully reproducible: python3 tools/bench_compare.py

Plugs into Claude Code via MCP, runs local inference through Ollama, MIT licensed.

Would love feedback from people actually running agentic workflows.

https://github.com/ninjahawk/hollow-agentOS

submitted by /u/TheOnlyVibemaster
