OpenAI just had one of its biggest days ever. Free tier gets GPT-4o and image generation. A new o3-Pro model drops at $200/month. Plans for $20,000/month AI agents leak. And Sam Altman confirmed the $20/month Plus price was always meant to be "introductory."
If you've been building on OpenAI's API or paying for Plus, this should make you think.
What actually happened
On April 9, 2025, OpenAI announced a massive upgrade to ChatGPT's free tier — unlimited GPT-4o access, image generation with DALL-E, file analysis, web search, and more. Sounds great, right?
Here's the catch: this is a classic platform play. Get millions of users dependent on the free tier, then gradually push them toward paid plans. Sam Altman essentially said as much — the $20/month Plus price was a "pilot," not a permanent commitment.
Meanwhile, at the top end:
- o3-Pro costs $200/month and uses ~1 million tokens per response
- OpenAI is planning $20,000/month tiers for "PhD-level AI agents"
- The company just raised $40 billion at a $300B valuation — investors expect returns
The pricing trajectory is clear: up.
The dependency problem
If your workflow depends on ChatGPT or the OpenAI API, you're at the mercy of someone else's pricing decisions. Every time OpenAI changes plans, raises API costs, or deprecates a model, you scramble.
This isn't theoretical. OpenAI has changed its API pricing multiple times. Models get deprecated. Rate limits shift. And now they're signaling that even consumer pricing will increase.
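One practical hedge is to keep the provider choice in a single config switch, so a price hike or deprecation means editing one entry rather than your whole app. A minimal sketch, assuming you'd route both through an OpenAI-style client (Ollama exposes an OpenAI-compatible endpoint at `http://localhost:11434/v1`; the function and model names here are illustrative):

```python
# Hypothetical config switch: keep the provider decision in one place so a
# price change or model deprecation is a one-line edit, not a rewrite.
def chat_config(provider: str) -> dict:
    configs = {
        # Hosted API: usage-billed, subject to the vendor's pricing decisions.
        "openai": {"base_url": "https://api.openai.com/v1", "model": "gpt-4o"},
        # Local Ollama server: OpenAI-compatible endpoint, fixed cost.
        "ollama": {"base_url": "http://localhost:11434/v1", "model": "llama3"},
    }
    return configs[provider]

cfg = chat_config("ollama")
```

Pass `base_url` and `model` through to whatever client library you use, and switching back to the cloud (or to another local model) is one string.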
The local AI alternative
Running models locally means your costs are fixed after the initial hardware investment. No monthly fees. No API rate limits. No surprise price increases. No one deprecating your model.
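To make "fixed costs" concrete, here's the back-of-envelope arithmetic. The numbers are assumptions, not measurements: a 300 W GPU under load, 2 hours of inference a day, $0.15/kWh.

```python
def monthly_electricity_usd(gpu_watts: float, hours_per_day: float,
                            usd_per_kwh: float, days: int = 30) -> float:
    """Rough marginal cost of local inference: GPU power draw only."""
    kwh = gpu_watts / 1000 * hours_per_day * days
    return kwh * usd_per_kwh

# Assumed figures: 300 W under load, 2 h/day, $0.15/kWh.
cost = monthly_electricity_usd(300, 2, 0.15)  # 18 kWh ≈ $2.70/month
```

Even tripling those assumptions keeps you well under a $20/month subscription, and nowhere near $200.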
Here's what you can run locally today:
For chat and coding:
- Ollama — dead simple CLI tool, one command to run Llama 3, Qwen, Mistral, etc.
- LM Studio — nice GUI, download and run models with a few clicks
- Locally Uncensored — all-in-one desktop app with chat, image gen, and video gen built in
For image generation:
- ComfyUI — node-based workflow, very flexible
- Stable Diffusion via AUTOMATIC1111 — the classic web UI
- Locally Uncensored also bundles image gen through ComfyUI integration
What you need:
- A GPU with 8GB+ VRAM handles most 7-8B parameter models well
- 16GB+ VRAM opens up 13-30B models that genuinely compete with GPT-4
- Models like Qwen 2.5 32B, Llama 3 70B (quantized), and Mistral Large are surprisingly capable
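The rule of thumb behind those VRAM numbers: quantized weights take roughly params × bits-per-weight / 8 bytes, plus a few GB for the KV cache and activations. This is a rough sizing sketch, not a guarantee — real usage varies with context length and runtime.

```python
def approx_vram_gb(params_billion: float, bits_per_weight: float,
                   overhead_gb: float = 1.5) -> float:
    """Approximate VRAM: quantized weight size plus a flat allowance
    for KV cache and activations (assumed, varies by context length)."""
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb

small = approx_vram_gb(7, 4)    # ~5.0 GB: fits an 8 GB card
medium = approx_vram_gb(32, 4)  # ~17.5 GB: wants 16 GB+ (with some offload)
```

That's why 4-bit quantization matters so much: the same 32B model at 16-bit would need roughly 65 GB and be out of reach for consumer hardware.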
The privacy bonus
OpenAI also launched "connectors" this week — ChatGPT can now index your Google Drive, Notion, and Slack. The r/ChatGPT community immediately raised privacy alarms. When you connect these services, OpenAI creates embeddings of your data and stores them on its servers.
With local models, your data never leaves your machine. Period. No embeddings on someone else's servers, no ambiguity about training data usage, no risk of a data breach exposing your files.
When cloud AI still makes sense
I'm not saying everyone should ditch OpenAI tomorrow. Cloud AI is still better for:
- Cutting-edge reasoning tasks where the very latest models matter
- Teams that need zero setup and shared access
- Use cases where you need the absolute best model quality regardless of cost
But if you're a developer, researcher, or privacy-conscious user who runs AI daily — having a local setup as your primary tool and cloud as a backup is increasingly the smart move.
Getting started
The barrier to entry for local AI has dropped dramatically in 2025. If you have a halfway decent GPU:
- Install Ollama and run `ollama run llama3` — you'll have a local chatbot in under 5 minutes
- Try LM Studio if you prefer a GUI
- Check out Locally Uncensored if you want chat + image gen + video gen in one app
The models aren't quite GPT-4 level across every task, but for most daily work — writing, coding, analysis, brainstorming — they're genuinely good enough. And they're free to run, forever.
OpenAI's pricing will keep going up. Your electricity bill won't change much.