I started using AI APIs about a year ago for side projects I was hacking on in the evenings. Nothing production scale.
By month three I was running up about $80 a month in charges. Not wild, but when I broke it down, I was spending way more than I needed to. Half of what I was doing could have run on a cheap model for pennies. I was just lazy.
Here's what I actually changed:
First, I stopped using a flagship for everything. My defaults were Claude 3.5 Sonnet and GPT-4o. Both great. Both way overpowered for half of what I asked them.
I had a little utility that turned a messy chunk of text into a clean title. Take in a paragraph, return one sentence. I was using Sonnet at $3 input and $15 output per million tokens, for a task a much simpler model could handle.
Swapping that one call to Gemini 2.5 Flash Lite at $0.10 input and $0.40 output per million tokens cut the per-request cost by about 30x (30x on input, 37.5x on output). Output quality was identical for this task.
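To sanity check the 30x figure, here is a minimal sketch of the per-request arithmetic. The prices are the ones quoted above. The token counts (200 in, 20 out) are an assumption for a paragraph-to-title call, not measured numbers.

```python
# Prices in USD per million tokens, as quoted in the post.
SONNET = {"input": 3.00, "output": 15.00}
FLASH_LITE = {"input": 0.10, "output": 0.40}


def request_cost(prices, input_tokens, output_tokens):
    """Cost in USD of one call: tokens times price per million, summed."""
    return (input_tokens * prices["input"]
            + output_tokens * prices["output"]) / 1_000_000


# Assumed sizes for a paragraph-in, title-out call (hypothetical).
IN_TOKENS, OUT_TOKENS = 200, 20

sonnet_cost = request_cost(SONNET, IN_TOKENS, OUT_TOKENS)
flash_cost = request_cost(FLASH_LITE, IN_TOKENS, OUT_TOKENS)
ratio = sonnet_cost / flash_cost

print(f"Sonnet:     ${sonnet_cost:.6f} per request")
print(f"Flash Lite: ${flash_cost:.6f} per request")
print(f"Ratio:      {ratio:.1f}x")
```

The exact ratio depends on the input/output mix. It sits between 30x for an all-input call and 37.5x for an all-output call.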
Rule I follow now: if the task is "transform this text a little," try a budget model first. Only reach for a flagship if the budget one actually fails.
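That rule can be sketched as a small wrapper. This is one possible shape, not what my utility literally does: the model functions are stand-ins (a real version would call a provider SDK), and the `looks_like_title` check is a hypothetical definition of "fails" that you'd replace with your own.

```python
from typing import Callable


def run_with_fallback(
    task_input: str,
    budget_model: Callable[[str], str],
    flagship_model: Callable[[str], str],
    is_acceptable: Callable[[str], bool],
) -> tuple[str, str]:
    """Try the budget model first; escalate only if its output fails the check.

    Returns (output, tier) where tier is "budget" or "flagship".
    """
    try:
        candidate = budget_model(task_input)
        if is_acceptable(candidate):
            return candidate, "budget"
    except Exception:
        # A crashed budget call counts as a failure; fall through to flagship.
        pass
    return flagship_model(task_input), "flagship"


# Stand-in models for illustration only (no network calls).
def fake_budget(text: str) -> str:
    return text.split(".")[0].strip().title()


def fake_flagship(text: str) -> str:
    return "A Carefully Written Title"


def looks_like_title(s: str) -> bool:
    # Hypothetical acceptance check: non-empty, one line, at most 80 chars.
    return 0 < len(s) <= 80 and "\n" not in s


title, tier = run_with_fallback(
    "cheap models handle simple rewrites. the rest is detail.",
    fake_budget, fake_flagship, looks_like_title,
)
print(tier, "->", title)
```

The acceptance check is the part that matters. If you can't write down what "fails" means for a task, you can't tell whether the cheap model is good enough.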
Second, I cached and trimmed my system prompts. Every major provider offers prompt caching now. Anthropic gives you 90 percent off cached reads. OpenAI does it automatically once your prompt goes over 1,024 tokens.
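At 3,000 calls a month with a 600 token system prompt, that's 1.8 million input tokens, so the prompt alone was costing me $5.40 on Sonnet. With caching at 90 percent off, 54 cents in the best case, where every call is a cache hit and the cached prefix is long enough to meet the provider's minimum.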
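The arithmetic behind those two numbers, as a minimal sketch. It assumes the best case: every call is a cache hit, there is no surcharge for writing to the cache, and the prompt qualifies for caching at all.

```python
CALLS_PER_MONTH = 3_000
SYSTEM_PROMPT_TOKENS = 600
INPUT_PRICE_PER_MTOK = 3.00  # Sonnet input price from the post, USD
CACHE_DISCOUNT = 0.90        # 90 percent off cached tokens


def monthly_prompt_cost(calls, prompt_tokens, price_per_mtok, discount=0.0):
    """Monthly USD cost of resending one system prompt on every call."""
    total_tokens = calls * prompt_tokens
    return total_tokens / 1_000_000 * price_per_mtok * (1 - discount)


uncached = monthly_prompt_cost(
    CALLS_PER_MONTH, SYSTEM_PROMPT_TOKENS, INPUT_PRICE_PER_MTOK
)
# Best-case assumption: 100 percent cache hit rate, no cache-write premium.
cached = monthly_prompt_cost(
    CALLS_PER_MONTH, SYSTEM_PROMPT_TOKENS, INPUT_PRICE_PER_MTOK, CACHE_DISCOUNT
)

print(f"Uncached: ${uncached:.2f} per month")
print(f"Cached:   ${cached:.2f} per month")
```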
While I was in there, I actually read my prompt for the first time in months. It was a mess. "Please provide a response." "It would be helpful if you could." Politeness costs tokens. I cut it from 600 tokens to 300. That halves the system prompt's input cost on every call, forever.
Read your system prompt out loud. If it sounds like a cover letter, it's too long.
Third, I got tired of doing the math. For every new model I wanted to try, I was running the same spreadsheet. Input tokens times price per million. Output tokens times price per million. Add. Check caching. It took long enough that I'd just pick something and hope.
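For reference, the spreadsheet boils down to something like the sketch below. It is not the code behind the tool. The class and function names are made up for illustration, the price table only holds the two models whose prices appear in this post, and the token counts in the example are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    input_per_mtok: float    # USD per million input tokens
    output_per_mtok: float   # USD per million output tokens
    cached_input_discount: float = 0.0  # fraction off cached input tokens


def call_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Input tokens times price, plus output tokens times price, cache applied."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    cached_rate = model.input_per_mtok * (1 - model.cached_input_discount)
    return (fresh_tokens * model.input_per_mtok
            + cached_tokens * cached_rate
            + output_tokens * model.output_per_mtok) / 1_000_000


def rank_by_cost(models, input_tokens, output_tokens, cached_tokens=0):
    """Return (name, cost) pairs sorted cheapest first."""
    costs = [
        (m.name, call_cost(m, input_tokens, output_tokens, cached_tokens))
        for m in models
    ]
    return sorted(costs, key=lambda pair: pair[1])


MODELS = [
    ModelPrice("Claude 3.5 Sonnet", 3.00, 15.00, cached_input_discount=0.90),
    # Cache discount for this model is not stated in the post; left at 0.
    ModelPrice("Gemini 2.5 Flash Lite", 0.10, 0.40),
]

# Arbitrary example call: 1,000 input tokens (600 of them cached), 200 output.
ranked = rank_by_cost(MODELS, input_tokens=1_000, output_tokens=200,
                      cached_tokens=600)
for name, cost in ranked:
    print(f"{name:<24} ${cost:.6f}")
```

Even with the cache discount applied only to the expensive model, the cheap one wins by a wide margin on this example, which is the whole point of checking before picking.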
So I built it into a tool. It's at quantacost.com. Paste text, pick a model, see what it costs. Compare 39 models side by side. Free, no signup. Prices are verified every morning against the official pricing pages, because I got burned once using someone else's calculator with numbers that were a year stale.
The right model for most tasks is not the smartest one. It's the cheapest one that doesn't fail.


