Hey DEV community, CallmeMiho here. I spent my Monday morning watching a junior dev ship a Rube Goldberg machine powered by a credit card. Let's talk about why your AI agents are bankrupting you.
The task was simple: extract a specific itemUuid and a scimId from a massive dump of raw Activity Log data. Instead of engineering a solution, the dev just piped the raw, unformatted JSON slop—full of broken quotes, escaped characters, and mixed HTML tags—directly into a prompt and told the model to "find the ID."
The result was a textbook case of probabilistic vibration. Because the log was a disaster of escaped quotes (\") and malformed HTML, the agent hit an escaped character, hallucinated an end-of-file bracket that didn't exist, and entered an infinite recursion loop trying to re-parse the "rest" of the string.
It wasn't "reasoning"; it was a neural network tripping over its own feet because it was forced to be a parser. By the time I killed the process, the agent had burned through 50,000 tokens of high-tier compute just to find a single scimId.
That is not "AI Engineering"—it is technical debt with a monthly subscription.
## The Hard Data: Measuring the Waste
If you aren't looking at the telemetry of your prompts, you aren't an architect; you're a philanthropist for cloud providers. We call the delta between these two columns the Hype Tax.
| Metric | Raw Activity Log Data | Cleaned/Extracted JSON |
|---|---|---|
| Input Volume | 45,000 tokens | 150 tokens |
| Cost Per Call | $0.68 | $0.002 |
| Latency | 15s | 2s |
| Success Rate | 70% | 99.9% |
Paying a model to navigate 45,000 tokens of "garbage" formatting is a total failure of basic engineering discipline. When you feed an LLM unextracted noise, you aren't just wasting money—you are intentionally introducing non-determinism into a process that should be binary.
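The table's own numbers make the Hype Tax concrete. Here's the back-of-envelope math (the per-token rate is implied by the table's rows, not a quoted provider price):

```javascript
// Back-of-envelope check on the table above. The per-token price is
// implied by the table's own rows, not a quoted provider rate.
const raw = { tokens: 45000, cost: 0.68 };
const clean = { tokens: 150, cost: 0.002 };

// Normalize to a per-million-token rate, then compute the cost multiple
// of shipping slop instead of a cleaned payload.
const perMillion = (r) => (r.cost / r.tokens) * 1e6;
console.log(perMillion(raw).toFixed(2)); // ~$15.11 per 1M input tokens
console.log((raw.cost / clean.cost).toFixed(0)); // 340x per call
```

A 340x cost multiple per call, before you even price in the retries from the 30% failure rate.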
## The Philosophy: LLMs Are Not Parsers
If a local Regex script or a JSON formatter can extract the signal for free, paying a model to do it is architectural waste.
High-performance engineering is grounded in a Secure by Design approach. Just as modern zero-knowledge architectures perform cryptographic operations locally to eliminate server-side risk, a professional AI integration must perform data extraction locally to eliminate token overhead. You should not trust a cloud LLM with raw data extraction; that belongs in the deterministic layer.
In a professional stack, we distinguish between Deterministic Logic (Regex, Zod) and Probabilistic Logic (LLMs). You don't use a billion-dollar neural network to find a DeviceKey in a text string; you use a parser. The LLM should only see the specific, non-sensitive identifiers required for the task after the local environment has done the heavy lifting.
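Here is what the deterministic layer looks like in practice — a minimal sketch, assuming the IDs live in URL-style paths (the exact field shapes will vary per logging system):

```javascript
// Sketch of the deterministic layer. The log shape below is hypothetical;
// the point is that two regexes do what the agent burned 50k tokens on.
const rawLog = String.raw`{\"event\":\"update\",\"url\":\"/api/items/3fa85f64-5717-4562-b3fc-2c963f66afa6\",\"actor\":\"scim:2819c223-7f76-453a-919d-413861904646\"}<div class=\"row\">...</div>`;

// UUID-shaped pattern, anchored to the path segment it lives in, so the
// escaped quotes and <div> noise never matter.
const itemUuid = rawLog.match(/\/api\/items\/([0-9a-f-]{36})/)?.[1];
const scimId = rawLog.match(/scim:([0-9a-f-]{36})/)?.[1];

console.log(itemUuid); // "3fa85f64-5717-4562-b3fc-2c963f66afa6"
console.log(scimId); // "2819c223-7f76-453a-919d-413861904646"
```

Zero tokens, zero latency, zero hallucinated brackets. The LLM only enters the picture if there's a genuinely probabilistic question left to answer about those two values.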
## The Fix: The Stage 1 Deterministic Pipeline
The only professional way to build AI integrations is a Two-Stage Agent Pipeline. Stage 1 is a local, deterministic cleanup phase. Before a single token is sent to a cloud provider, the data must pass through a local pipeline that strips the noise.
If you don't have a local pipeline, use offline utilities to build one:
- Regex Tester: Your first line of defense. Use a local script to pull the itemUuid from a URL before the LLM ever sees the log entry. If the model is reading <div> tags to find a UUID, you have already failed. Build your regex patterns here.
- JSON Formatter: Standardize and minify. This prevents the model from "vibrating" on broken quotes or escaped characters. A minified schema ensures the model focuses on the values rather than the syntax of the log. Format and minify your JSON here.
- Zod Schema Generator: Use this to enforce strict data contracts. Based on the Least Privilege principle, your Zod schema should programmatically strip fields before the prompt is assembled. If the model doesn't need to see it, Zod shouldn't pass it. Generate your Zod schemas here.
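The three steps above compose into one Stage 1 function. A minimal sketch in plain Node (no deps — in a real stack the allowlist step would be a Zod schema with .pick()/.strict(); the field names here are hypothetical):

```javascript
// Stage 1: everything that runs BEFORE a single token leaves the machine.
function stageOne(rawEntry) {
  // 1. Strip HTML noise before parsing. (Crude by design: this is a
  //    sketch, and it will also strip tags inside string values.)
  const noHtml = rawEntry.replace(/<[^>]+>/g, "");

  // 2. Parse deterministically. A throw here means "fix the producer",
  //    not "ask the LLM to guess the closing bracket".
  const parsed = JSON.parse(noHtml);

  // 3. Least Privilege: only the fields the prompt actually needs.
  const allow = ["itemUuid", "scimId"];
  const scoped = Object.fromEntries(
    allow.filter((k) => k in parsed).map((k) => [k, parsed[k]])
  );

  // 4. Minify, so the model sees values rather than syntax.
  return JSON.stringify(scoped);
}

const entry =
  '<div>{"itemUuid":"abc-123","scimId":"scim-9","rawHtml":"<b>x</b>","secretKey":"do-not-send"}</div>';
console.log(stageOne(entry)); // {"itemUuid":"abc-123","scimId":"scim-9"}
```

Note what never reaches the prompt: the HTML wrapper, the embedded markup, and the secretKey. That last one is the Secure by Design payoff — the deterministic layer is also your data-leak firewall.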
## Conclusion: Engineering Discipline vs. Hype
High-performance engineering isn't about how much AI you use, but how little you need to use to get the job done. If you are sending unformatted log slop to a cloud model, you aren't building an agent; you are building a technical debt generator.
Stop subsidizing Sam Altman’s compute with your company’s technical debt. Clean your data or get out of the kitchen.
P.S. If you want to audit your token payloads before they bankrupt you, I built a suite of 100% offline, zero-server-log developer tools at FmtDev.dev.