"Created separate private API keys for each service within LiteLLM and started logging the usage via Prometheus to view in Grafana. Surprised how quickly the Frigate GenAI summary tokens add up! This view is only the past 6 hours."
"What do you guys even use local LLMs for?" Me: A lot
Reddit r/LocalLLaMA / 4/30/2026
💬 Opinion · Developer Stack & Infrastructure · Signals & Early Trends · Tools & Practical Usage
Key Points
- A Reddit user shares that they use local LLMs via LiteLLM with separate private API keys per service for better access control and tracking.
- They log local LLM usage through Prometheus and visualize the metrics in Grafana to monitor real demand and performance.
- The user notes that Frigate GenAI summary token usage can accumulate very quickly, making monitoring important.
- They emphasize that token usage is already significant even over a window of only the last six hours.
- The post implicitly argues that local LLM users should measure and manage token consumption across services rather than assume it stays low.
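The setup the poster describes (per-service keys in LiteLLM, usage exported to Prometheus, viewed in Grafana) can be sketched as a minimal proxy config. This is an illustrative fragment, not the poster's actual config: the `success_callback: ["prometheus"]` setting follows LiteLLM's documented proxy configuration, but exact key names and the availability of the Prometheus integration should be verified against your installed LiteLLM version.

```yaml
# Hypothetical LiteLLM proxy config sketch (not the poster's actual file).
litellm_settings:
  success_callback: ["prometheus"]   # export per-key token/usage metrics for Prometheus to scrape

general_settings:
  master_key: "sk-REPLACE-ME"        # admin key used to mint the per-service keys
```

With a config like this, separate private keys for each service (e.g. one for Frigate's GenAI summaries) can be issued through the proxy's `/key/generate` endpoint, and Grafana then breaks down token consumption per key, which is how a single service's usage becomes visible.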