I spent three weeks rebuilding ls, grep, cat, find, stat, diff, and 16 other Unix coreutils in Go. Not because the originals are broken — they're masterpieces of systems programming that have survived decades of use. I rebuilt them because AI coding agents are terrible at reading their output.
The Problem Nobody Talked About
Every time an AI agent runs ls src/, it receives something like this:
-rw-r--r-- 1 user staff 2048 Apr 6 12:00 main.go
drwxr-xr-x 3 user staff 96 Apr 6 11:00 internal
lrwxr-xr-x 1 user staff 12 Apr 6 10:00 link -> main.go
The agent has to figure out which column is the filename. Which is the size. Whether that d at the start means directory. Whether Apr 6 means this year or last year. It guesses. Sometimes it guesses wrong. And every wrong guess costs tokens, introduces errors, and degrades the quality of the code it writes.
Now multiply that by every grep, every cat, every find the agent runs in a single session. The token waste is staggering. The parsing fragility is a constant source of subtle bugs.
The Insight
AI agents don't need pretty terminal output. They need structured data. They need to know that main.go is 2048 bytes, was modified 3600 seconds ago, is written in Go, has MIME type text/x-go, and is not a binary file. They need this information labeled, unambiguous, and machine-readable.
So I asked: what if ls returned XML?
<ls timestamp="1712404800" total_entries="3">
<file name="main.go" path="src/main.go" absolute="/project/src/main.go"
size_bytes="2048" size_human="2.0 KiB"
modified="1712404800" modified_ago_s="3600"
language="go" mime="text/x-go" binary="false"/>
<directory name="internal" path="src/internal"/>
<symlink name="link" target="main.go" broken="false"/>
</ls>
Zero ambiguity. Zero parsing. The agent reads the attributes and knows exactly what it's looking at. No regex to extract filenames from column-aligned text. No heuristic to determine if something is a directory. No guesswork.
Why XML and Not JSON?
Good question. JSON is the lingua franca of APIs. But XML has a structural advantage that matters for AI context windows: attributes.
Compare these two representations of the same file:
<file size_bytes="2048" language="go" mime="text/x-go"/>
{"size_bytes": 2048, "language": "go", "mime": "text/x-go"}
Count them: the XML version is 56 characters, the JSON version 60. The per-attribute gap is small (attribute syntax drops JSON's quoted keys and colons), but it compounds. When you're listing 1,000 files with a dozen attributes each, the savings run to thousands of tokens. AI context windows are expensive and limited. Every character counts.
That said, aict supports --json for every tool. The schema is identical. Use whatever your pipeline prefers.
Why Go?
Three reasons:
Single binary. Go compiles to a static binary with zero runtime dependencies. aict is one file you drop on a system and it works. No pip install, no npm install, no shared libraries to manage. For a tool that's supposed to replace coreutils, this is non-negotiable.
Standard library only. Every feature — regex matching, MIME detection, filesystem walking, XML encoding — uses Go's standard library. Zero external dependencies means zero supply chain risk, zero version conflicts, and the ability to audit the entire codebase in an afternoon.
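The standard library genuinely covers the enrichment features. I don't know aict's exact detection code, but a stdlib-only sketch of MIME sniffing plus a common binary heuristic (the sniff function is my own illustration) looks like this:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// sniff returns a best-effort MIME type and a binary flag for a file's
// leading bytes (callers typically pass the first 512 bytes).
func sniff(head []byte) (mime string, binary bool) {
	// http.DetectContentType implements the WHATWG MIME-sniffing
	// algorithm; it never fails, falling back to
	// "application/octet-stream" when nothing matches.
	mime = http.DetectContentType(head)
	// Widespread heuristic: any NUL byte in the sample means binary.
	binary = bytes.IndexByte(head, 0) >= 0
	return mime, binary
}

func main() {
	fmt.Println(sniff([]byte("package main\n")))
	fmt.Println(sniff([]byte{0x7f, 'E', 'L', 'F', 0x00}))
}
```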
Performance is good enough. Yes, aict grep is slower than ripgrep. Yes, aict ls is slower than eza. But we're talking 15ms vs 2ms for listing 1,000 files. The overhead comes from language detection, MIME sniffing, and structured output — features that are the entire point of the project. For normal codebases, the difference is imperceptible.
What I Built
Twenty-two tools across five categories:
- File inspection: cat, head, tail, file, stat, wc
- Directory & search: ls, find, grep, diff
- Path utilities: realpath, basename, dirname, pwd
- Text processing: sort, uniq, cut, tr
- System & environment: env, system, ps, df, du, checksums
Plus a git subcommand suite (status, diff, log, ls-files, blame) and an MCP server that exposes every tool as a callable function to AI assistants like Claude and Cursor.
Every tool supports three output modes: XML, JSON, and plain text. Every error is structured data in stdout, never stderr. Every path is absolute. Every timestamp is a Unix epoch integer.
The MCP Server
This is where it gets interesting. aict ships with an MCP (Model Context Protocol) server binary called aict-mcp. You configure it in Claude Desktop or Cursor, and suddenly every tool becomes a typed, callable function.
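The wiring follows the standard MCP server-configuration shape. In Claude Desktop, that means an entry in claude_desktop_config.json along these lines (the binary path here is an assumption; point it wherever your install put aict-mcp):

```json
{
  "mcpServers": {
    "aict": {
      "command": "/usr/local/bin/aict-mcp"
    }
  }
}
```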
The AI agent doesn't shell out to run aict ls src/. It calls the ls function with {path: "src/"} and receives structured JSON. No shell spawning. No output parsing. No ambiguity.
This is the future of how AI agents interact with filesystems. Not by typing commands into a terminal and reading the output like a human would. By calling typed functions and receiving typed responses.
What I Didn't Build
I intentionally excluded write operations: cp, mv, rm, mkdir, chmod, chown. These are dangerous when called by AI agents without human confirmation. aict is a read-only tool. It observes, it doesn't modify.
I also didn't try to match GNU coreutils flag-for-flag. Where a flag made sense for the AI use case, I added it. Where it didn't, I skipped it. The goal is not compatibility — it's utility for AI agents.
The Honest Benchmark
I benchmarked aict against GNU coreutils. Here are the results:
| Tool | GNU | aict | Ratio |
|---|---|---|---|
| ls (1,000 files) | ~2ms | ~15ms | 7× |
| grep (100k lines) | ~1ms | ~100ms | 100× |
| find (deep tree) | ~2ms | ~9ms | 5× |
| cat (100k lines) | ~1ms | ~23ms | 17× |
| diff (1,000 lines) | ~1ms | ~10ms | 10× |
grep and cat are slow because every file is MIME-typed and language-detected. Use --plain to skip enrichment when you only need content. The trade-off is intentional: a few extra milliseconds of processing in exchange for semantic information the agent would otherwise burn tokens, and make mistakes, trying to infer.
Is This Actually Useful?
I've been using aict with Claude and Cursor for two months. The difference is noticeable. The agent makes fewer mistakes about file types. It doesn't confuse directories with files. It correctly identifies binary files before trying to read them. It understands the structure of a codebase faster.
The token savings are real. A directory listing that used to cost 2,000 tokens in plain text now costs 800 in XML with three times the information density. Over a typical coding session with dozens of tool calls, that adds up.
Open Source
The project is MIT licensed and on GitHub. It's written in Go with zero external dependencies. You can audit the entire codebase in an afternoon. I'd love contributions — new tools, performance improvements, bug fixes.
If you build AI agents that interact with codebases, give it a try. Your agent will thank you. And if it doesn't work for your use case, that's fine too. GNU coreutils aren't going anywhere.
The repo is at github.com/synseqack/aict.




