Moved an AI feature into production a few months ago and the cost profile has been a constant surprise since. The demos and early prototypes ran cheap because volume was tiny and the prompts were short, but once it hit real traffic, token usage scaled much faster than expected. I think there were two drivers: customers ask longer, less structured questions than anything in our test set, and we ended up adding context retrieval that roughly doubled the input length on every call.
We started on GPT-4o for the early version, and the response quality was good enough that nobody pushed back. But after a few weeks of volume the bill came in much higher, and finance had no way to break out which feature or which model was driving it. Right now I am pulling exports from the OpenAI dashboard and trying to map them back to features manually, which is not sustainable.
I shipped the feature, so now I am the de facto owner of the cost question. The OpenAI dashboard tells me the total, but not what I actually need to answer. I spend half a day every week trying to reconcile token counts against feature usage, and I am still not confident in the numbers I hand off.
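For what it's worth, the workaround I've been converging on is tagging every API call with a feature name at call time and aggregating tokens and cost in our own ledger, instead of reverse-engineering the dashboard after the fact. A minimal sketch of that idea, assuming you can read `prompt_tokens` / `completion_tokens` off each response (the OpenAI SDK exposes these on `response.usage`) and that the per-million-token prices below are placeholders you'd replace with current pricing:

```python
from collections import defaultdict

# Placeholder per-1M-token prices in USD; swap in current pricing
# for whatever models you actually run.
PRICES = {"gpt-4o": {"input": 2.50, "output": 10.00}}

class CostLedger:
    """Aggregates token usage and estimated cost per (feature, model)."""

    def __init__(self):
        self.totals = defaultdict(
            lambda: {"input_tokens": 0, "output_tokens": 0, "cost_usd": 0.0}
        )

    def record(self, feature, model, prompt_tokens, completion_tokens):
        # Compute the estimated cost of this one call.
        p = PRICES[model]
        cost = (prompt_tokens / 1_000_000) * p["input"] \
             + (completion_tokens / 1_000_000) * p["output"]
        # Roll it up under the feature/model pair.
        row = self.totals[(feature, model)]
        row["input_tokens"] += prompt_tokens
        row["output_tokens"] += completion_tokens
        row["cost_usd"] += cost
        return cost

ledger = CostLedger()
# In real code the token counts come from response.usage on each call;
# here they are hardcoded to illustrate the accounting.
ledger.record("support_answers", "gpt-4o", 1_000_000, 100_000)
ledger.record("support_answers", "gpt-4o", 500_000, 50_000)
print(ledger.totals[("support_answers", "gpt-4o")])
```

Even this crude version gives finance a per-feature breakdown that the dashboard export can't, and it catches model-migration cost deltas the day they land instead of at month end.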




