I Replaced 12 Kitchen Managers Guessing "How Much Chicken Do We Need" With 3 ML Models. Here's the Entire Architecture.

Dev.to / 4/11/2026

💬 Opinion · Developer Stack & Infrastructure · Ideas & Deep Analysis · Tools & Practical Usage

Key Points

  • The article describes a restaurant chain’s inventory and procurement workflow across 12 locations that relies on managers’ gut feelings, spreadsheets, and phone calls rather than reliable data tracking.
  • It highlights a major “waste data gap,” where ingredients used during the day are not reconciled against what was ordered, leaving the true loss and consumption pattern unknown.
  • The author proposes replacing 12 human “how much chicken do we need” decisions with three machine-learning models designed to drive ordering more systematically.
  • It outlines how the current process includes several automatable, mechanical steps (ordering entry, supplier quote comparison, delivery reconciliation, and end-of-month reporting), but lacks the feedback loop needed for learning and optimization.

This is a case study: AI in Supply Chain
Every restaurant chain has the same dirty secret. Nobody actually knows how much food they waste.

I worked on a system for a 12-location restaurant chain where the entire inventory process was running on vibes. Kitchen manager walks in at 7 AM, looks around, thinks "yeah we're low on chicken", calls procurement, says "send 20 kg". That's it. That's the system.

No data. No tracking. No feedback loop. Just a human eyeballing a fridge and making a phone call.

The mess we started with

Let me walk you through what actually happens every single day across 12 restaurants.

  1. Morning (per restaurant):
    Kitchen manager does a visual walkthrough. Decides what's low based on gut feeling and experience. Writes it down on paper. Sometimes doesn't write it down at all, just remembers. Calls the central procurement office and dictates what they need.

  2. Procurement office:
    Officer enters the order into a spreadsheet. Then calls 3-4 suppliers asking for price quotes. Same suppliers. Same conversation. Every single week. Compares quotes, picks the cheapest, places the order.

  3. Next day:
    Delivery arrives. Sometimes it's complete, sometimes it's not. Kitchen manager checks it against the order. If something's wrong, calls procurement again. More phone calls.

  4. The gap nobody talks about:
    Throughout the day the kitchen uses ingredients. But nobody tracks how much was actually used versus how much was ordered. You ordered 20 kg chicken. The POS system shows you sold dishes that should use about 12 kg. End of day you have 3 kg left. Where did the other 5 kg go? Nobody knows. Nobody is even asking the question.

  5. Month end:
    Each of the 12 restaurant managers compiles their ordering costs into a spreadsheet. Emails it to head office. Someone at head office manually merges 12 spreadsheets into one P&L report. Takes about 8 hours. The report is full of errors but nobody has the energy to check.

Head office looks at the consolidated report and tries to spot problems. But they can't. They don't have waste data. They don't have supplier performance data. They don't know which location is over-ordering. They're making decisions blind.
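The arithmetic behind step 4's mystery is dead simple once you actually have the inputs. A minimal sketch, using the article's chicken numbers; the function name is illustrative:

```python
def unaccounted_waste(opening_kg, delivered_kg, pos_usage_kg, closing_kg):
    """Whatever was available but neither sold nor left over is waste."""
    available = opening_kg + delivered_kg
    return available - pos_usage_kg - closing_kg

# Delivered 20 kg, POS says dishes consumed 12 kg, 3 kg on the scale at close.
waste = unaccounted_waste(opening_kg=0, delivered_kg=20, pos_usage_kg=12, closing_kg=3)
print(waste)  # 5
```

Four numbers, one subtraction. The problem was never the math; it was that three of the four inputs were never captured.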

I mapped 14 steps in this workflow. Here's how they break down.

The 14 steps, classified

The pattern is clear. Most steps are mechanical and can be automated with tools. The judgment steps can be replaced with ML models. And the biggest problem (step 11) isn't even a bad process; it's a missing process entirely.

# Inventory & Procurement Automation Pipeline

| Step                          | Component                              | Why |
|-------------------------------|----------------------------------------|-----|
| Visual inventory check        | TOOL (digital scales + POS data) as TRIGGER | Replace eyeballing with actual measurement. POS tells you what was sold. Scales tell you what's left. This data arriving each morning triggers the whole system. |
| Write order list              | DETERMINISTIC ML (Demand Predictor)    | A regression model takes day-of-week, reservations, weather, past sales, current inventory and outputs precise order quantities. This is not a language problem. It's a math problem. |
| Call procurement              | TOOL (automated routing)               | The call is eliminated entirely. System sends computed order to the procurement workflow. |
| Enter into spreadsheet        | TOOL (database)                        | Spreadsheet replaced by structured database. Orders logged automatically. |
| Call suppliers for quotes     | TOOL (supplier API)                    | Suppliers provide price lists via API or weekly upload. System queries automatically. |
| Compare quotes, pick supplier | DETERMINISTIC ML (Supplier Scorer)     | Score by price + on-time delivery rate + quality rejection rate. A weighted scoring model, not an LLM having a think about it. |
| Place order                   | TOOL (supplier API)                    | Auto-execute within guardrails. |
| Delivery arrives              | TOOL (delivery confirmation)           | Kitchen manager confirms receipt in app. Quantities logged against order. |
| Check delivery vs order       | TOOL (automated matching)              | System compares received vs ordered. Flags mismatches. |
| Resolve mismatches            | HUMAN CHECKPOINT                       | Mismatches above 10% escalated to procurement manager with full context. Small variances auto-accepted and logged. |
| Track usage vs ordered        | DETERMINISTIC ML (Waste Detector)      | Ordered 20 kg. POS shows 12 kg used in dishes. Scale shows 3 kg remaining. 5 kg unaccounted = waste. Model flags restaurants above benchmark. |
| Monthly cost compilation      | TOOL (automated report)                | All orders already in database. Monthly totals are a SQL query. |
| Consolidate 12 restaurants    | TOOL (dashboard)                       | One query across all locations. Real-time. |
| Review for anomalies          | LLM (narrative) + HUMAN CHECKPOINT     | LLM generates readable brief. "Restaurant 7 has 23% higher chicken costs than chain average. Waste rate is 18% vs chain average 9%." Human reads it and decides what to do. |

Notice: the LLM shows up exactly once. At the very end. For narration. It does not make a single decision in this entire pipeline.

System architecture

The system has three layers that run in parallel.

Layer 1: Restaurant agents (x12, running independently)

Each restaurant runs its own agent. Same logic, different data. They don't wait for each other.

TRIGGER: Daily at 5 AM

→ [TOOL] Pull POS sales data (yesterday)
→ [TOOL] Pull digital scale readings (current inventory)

→ [DETERMINISTIC ML] Demand Predictor
    Inputs: day of week, reservations, weather,
            yesterday's sales, historical patterns,
            current inventory
    Output: order quantity per ingredient
            with confidence interval

→ [DETERMINISTIC ML] Waste Detector
    Inputs: yesterday's opening inventory,
            deliveries received, POS-derived usage,
            closing inventory from scale
    Output: waste amount per ingredient,
            flag if above threshold

→ [TOOL] Query supplier prices via API

→ [DETERMINISTIC ML] Supplier Scorer
    Inputs: current prices, historical on-time rate,
            quality rejection rate
    Output: ranked supplier per ingredient
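To make the Demand Predictor's interface concrete, here is a toy sketch: features in, order quantity plus a confidence interval out. The weights and the weekend lift are made-up placeholders, not the production model; a real deployment would fit a regression on historical data.

```python
import statistics

def predict_demand(day_of_week, reservations, recent_daily_sales_kg, current_inventory_kg):
    """Return (order_kg, low_kg, high_kg) for one ingredient."""
    base = statistics.mean(recent_daily_sales_kg)        # historical pattern
    weekend_lift = 1.25 if day_of_week in ("Fri", "Sat") else 1.0  # placeholder weight
    expected = base * weekend_lift + 0.2 * reservations  # toy linear model
    spread = statistics.stdev(recent_daily_sales_kg)     # uncertainty proxy
    order = max(0.0, expected - current_inventory_kg)    # order only what's missing
    return round(order, 1), round(max(0.0, order - spread), 1), round(order + spread, 1)

print(predict_demand("Fri", 30, [14, 16, 15, 17, 18], 3))
```

The point is the shape of the contract: deterministic inputs, a number out, and an interval the downstream guardrails can reason about. No language model anywhere near it.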

Layer 2: Decision branching

This is where the system decides what needs a human and what doesn't.

IF order < ₹50K
   AND all items from preferred suppliers
   AND waste levels normal:
   → AUTO-EXECUTE order via supplier API
   → Log to database

IF order ≥ ₹50K
   OR non-standard supplier
   OR unusual quantity (>2x average):
   → HUMAN CHECKPOINT
   → Procurement manager approves/modifies/rejects
   → Execute approved order

IF waste flag triggered (>15% waste on any ingredient):
   → ALERT restaurant manager + head office
   → [LLM] Generate waste analysis brief
   → HUMAN investigates and logs root cause

The auto-execute path handles the routine. The human checkpoint catches the unusual. The waste alert handles the unknown. Three paths, clear rules, no ambiguity.

Layer 3: Head office consolidation agent

Runs daily and monthly. Aggregates everything.

DAILY:
→ [TOOL] Aggregate orders, costs, waste across 12 locations
→ [DETERMINISTIC ML] Anomaly detector
   Flags restaurants above/below cost benchmarks
→ Dashboard updated

MONTHLY:
→ [TOOL] Generate consolidated P&L from database
→ [DETERMINISTIC ML] Trend analysis
→ [LLM] Generate monthly narrative:
   "Chain-wide food cost: 32.4% of revenue (target 30%).
    Top performer: Restaurant 2 at 28.1%.
    Needs attention: Restaurant 9 at 37.2%,
    driven by 22% waste rate on vegetables."
→ HUMAN CHECKPOINT: Head office reviews and decides
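One plausible shape for the daily anomaly detector is a z-score cut across the 12 locations: flag anyone far from the chain's spread, in either direction. The per-location cost figures below are invented for illustration (only Restaurant 2's 28.1% and Restaurant 9's 37.2% come from the article's narrative):

```python
import statistics

def flag_outliers(cost_pct_by_location, z_cut=2.0):
    """Return locations whose food-cost %% sits > z_cut std devs from the mean."""
    values = list(cost_pct_by_location.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [loc for loc, pct in cost_pct_by_location.items()
            if abs(pct - mean) / sd > z_cut]

costs = {"R1": 31.5, "R2": 28.1, "R3": 32.0, "R4": 31.8, "R5": 33.1, "R6": 32.4,
         "R7": 33.5, "R8": 31.2, "R9": 37.2, "R10": 32.6, "R11": 31.9, "R12": 32.3}
print(flag_outliers(costs))  # flags the outliers at both ends
```

Note it flags the best performer too, which is exactly what the table's "above/below cost benchmarks" wording asks for: an unusually cheap location is worth a look (is it under-ordering and stocking out?) just as much as an expensive one.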

The feedback loop

This is what makes the system get smarter over time.
Every order, every delivery, every waste measurement, every demand prediction gets stored. The system tracks prediction accuracy.

"We predicted 18 kg chicken demand for Restaurant 3 on a Tuesday. Actual was 21 kg. Error: 16%."

This feeds back into the demand predictor. Over weeks and months, the model learns each restaurant's patterns. It learns that Restaurant 5 always spikes on Fridays. That Restaurant 9 has higher waste on vegetables every monsoon season. That Supplier B's delivery times slip during festivals.

The ML models don't stay static. They improve because the memory layer captures outcomes, not just decisions.
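The bookkeeping for that loop is small: store (predicted, actual) pairs and compute a rolling error rate the predictor can use to widen its intervals. A sketch, with the article's Tuesday pair as the first entry; we measure error against actuals (standard MAPE), which is a denominator choice the article leaves open:

```python
def rolling_mape(history):
    """Mean absolute percentage error over logged (predicted, actual) pairs."""
    errors = [abs(pred - actual) / actual for pred, actual in history if actual]
    return sum(errors) / len(errors)

# (predicted kg, actual kg); first pair is the 18 kg vs 21 kg Tuesday example.
history = [(18, 21), (20, 19), (15, 16)]
print(f"{rolling_mape(history):.1%}")
```

When this number climbs over a recent window, the predictor's confidence intervals widen and the human-review flag from the failure-mode section kicks in.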

Failure mode analysis

Every component will fail eventually. The question isn't "will it fail" but "what happens when it does."

POS data feed goes down: Kitchen manager enters yesterday's estimated covers manually. System uses last similar day as a proxy. Alert goes to IT.

Scales malfunction or staff skip weighing: System flags missing reading and falls back to calculated inventory: yesterday's stock + deliveries - POS usage. Not perfect but better than nothing.

Demand predictor is wildly wrong: Confidence intervals widen automatically when recent errors are high. Low-confidence predictions get a 20% buffer and human review flag. Model retrains weekly.

Supplier API is down: Order gets queued. SMS notification sent to procurement officer with order details for manual phone placement. Order logged in app when confirmed.

Waste detector throws false positives: Waste alerts require 2+ consecutive days above threshold before escalating. Single-day spikes get noted but don't trigger alarms. Reduces alert fatigue.

Auto-execution bug orders 10x quantity: Hard guardrail. No auto-order can exceed 3x the 4-week average for that ingredient at that restaurant. Anything above requires human approval. Daily spend cap per location.
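The hard guardrail is the kind of check that should be boringly explicit in code. A sketch of both limits; the daily spend cap value is a made-up placeholder, since the article doesn't state one:

```python
DAILY_SPEND_CAP_INR = 200_000  # placeholder; the article doesn't give the cap

def guardrail_ok(order_kg, four_week_avg_kg, order_cost_inr, spent_today_inr):
    """Both limits must pass or the order escalates to a human."""
    within_qty = order_kg <= 3 * four_week_avg_kg
    within_spend = spent_today_inr + order_cost_inr <= DAILY_SPEND_CAP_INR
    return within_qty and within_spend

print(guardrail_ok(200, 18, 40_000, 0))  # 10x bug order: blocked
print(guardrail_ok(20, 18, 40_000, 0))   # normal order: allowed
```

Crucially this runs after the ML, outside it: a model bug, a bad retrain, or a data glitch all hit the same dumb ceiling.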

LLM hallucinates a number in the monthly report: LLM never generates numbers. All figures come from the database as structured data. LLM only wraps them in natural language. If LLM is unavailable, the dashboard with raw numbers still works.
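The "LLM never generates numbers" rule has a simple mechanical form: every figure is computed from the database and handed to the narrator as structured data. Here a plain f-string stands in for the LLM so the contract is visible; in production the same dict would go into the prompt and the LLM would only phrase it:

```python
# All figures come from SQL; the narrator only wraps them in language.
facts = {"restaurant": 7, "cost_delta_pct": 23, "waste_pct": 18, "chain_waste_pct": 9}

brief = (
    f"Restaurant {facts['restaurant']} has {facts['cost_delta_pct']}% higher "
    f"chicken costs than chain average. Waste rate is {facts['waste_pct']}% "
    f"vs chain average {facts['chain_waste_pct']}%."
)
print(brief)
```

If the LLM is swapped out or down, the `facts` dict still renders on the dashboard unchanged, which is exactly the degradation path described above.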

The design principle: If any ML component fails, the system degrades to the current manual process. Not to something worse. The worst case scenario is "things go back to how they are today." That's the floor, not a cliff.

Conclusion:
ML decides. LLM explains. Humans approve. That's the pattern.