The observation that started this: most of what people use AI for every day (summarising, drafting, classifying, extracting and so on) doesn't actually require a frontier model. Any competent 8B-70B model handles those just as well. But most people run everything through Claude or ChatGPT out of habit.
I built Followloop (followloop.app) to solve this automatically. It classifies each task by complexity and routes it:
- Simple tasks → Cerebras Llama (2000 TPS, 1M tokens/day free), Groq, Gemini Flash
- Moderate tasks → Groq 70B, SambaNova
- Complex tasks → Claude Haiku as fallback
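To make the routing idea concrete, here is a minimal Python sketch. The tier and provider names come from the list above; the keyword heuristic, the word-count thresholds, and the `classify` and `route` functions are purely illustrative assumptions, not how Followloop actually classifies tasks.

```python
# Tier -> providers in order of preference (names taken from the list above).
ROUTES = {
    "simple": ["Cerebras Llama", "Groq", "Gemini Flash"],
    "moderate": ["Groq 70B", "SambaNova"],
    "complex": ["Claude Haiku"],
}

# Hypothetical keyword hints; a real classifier would be far more careful.
SIMPLE_HINTS = ("summarise", "summarize", "classify", "extract", "draft")
COMPLEX_HINTS = ("debug", "refactor", "architecture", "step-by-step")


def classify(task: str) -> str:
    """Toy complexity heuristic (an assumption, not Followloop's logic)."""
    text = task.lower()
    word_count = len(text.split())
    if any(hint in text for hint in COMPLEX_HINTS) or word_count > 400:
        return "complex"
    if any(hint in text for hint in SIMPLE_HINTS) and word_count <= 100:
        return "simple"
    return "moderate"


def route(task: str, unavailable: frozenset = frozenset()) -> tuple:
    """Return (tier, provider), skipping providers marked unavailable."""
    tier = classify(task)
    for provider in ROUTES[tier]:
        if provider not in unavailable:
            return tier, provider
    # Assumed behaviour: if a whole tier is down, fall back to the complex tier.
    return "complex", ROUTES["complex"][0]


if __name__ == "__main__":
    print(route("Summarise this email thread"))
    print(route("Refactor the billing module architecture"))
```

The `unavailable` set stands in for rate limits or exhausted free quotas: the router walks the tier's provider list in order and takes the first one still usable.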
The dashboard shows your actual cost alongside what you'd have paid running everything on Claude Sonnet. I've been running it on my own AI workflow for two weeks: 9,200 tasks routed, $0.1360 actual cost, $21.24 saved. That works out to about 157× cheaper per token than Sonnet on average.
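For anyone who wants to check the arithmetic, here is a small Python sketch. It assumes "saved" means the Sonnet baseline minus the actual cost, and that the same tokens would have been processed either way, so the cost ratio equals the per-token ratio. The function name and the returned fields are hypothetical, not taken from the dashboard.

```python
def savings_summary(actual_cost: float, saved: float, tasks: int) -> dict:
    """Derive baseline cost and cost ratio from the two reported figures."""
    # Assumption: baseline = what Sonnet would have cost = actual + saved.
    baseline = actual_cost + saved
    return {
        "baseline_cost": baseline,
        "ratio": baseline / actual_cost,
        "actual_cost_per_task": actual_cost / tasks,
    }


if __name__ == "__main__":
    summary = savings_summary(actual_cost=0.1360, saved=21.24, tasks=9200)
    print(round(summary["ratio"]))  # prints 157
```

Under those assumptions, the reported $0.1360 and $21.24 are consistent with the 157× figure.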
Works with any AI setup via MCP (Model Context Protocol) - Claude Desktop, Cursor, Claude Code, or anything MCP-compatible.
Also has a library of 1,300+ safety-screened MCP servers as a bonus feature.
$5/month at followloop.app