TRACER: Trace-Based Adaptive Cost-Efficient Routing for LLM Classification
arXiv cs.AI / 4/17/2026
Key Points
- TRACER is an open-source routing system that trains lightweight ML surrogates using labeled input-output pairs already collected in production logs from an LLM classification endpoint.
- It deploys a surrogate only when a “parity gate” indicates its agreement with the LLM exceeds a user-defined threshold (α), aiming to reduce marginal inference cost.
- TRACER creates interpretability artifacts to make the surrogate-to-LLM handoff boundary transparent, including what input regions the surrogate covers, where it plateaus, and why it defers.
- Experiments on intent classification benchmarks show high surrogate coverage: 83–100% on a 77-class task with Sonnet 4.6 as the LLM, and full replacement of the LLM on a 150-class task. On a natural language inference task, TRACER correctly withholds deployment because reliable separation is not possible.
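
The parity gate described above can be illustrated with a minimal sketch. This is not TRACER's actual implementation or API. It assumes the surrogate emits a confidence score per input, and that the gate picks the lowest confidence cutoff at which surrogate-LLM agreement on held-out traces still meets α. The names `GateResult`, `parity_gate`, and `route` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class GateResult:
    deploy: bool                # False means every request goes to the LLM
    threshold: Optional[float]  # minimum surrogate confidence to answer locally
    coverage: float             # fraction of held-out traces the surrogate would handle
    agreement: float            # surrogate-LLM agreement on that covered fraction


def parity_gate(
    confidences: Sequence[float],
    surrogate_labels: Sequence[str],
    llm_labels: Sequence[str],
    alpha: float,
) -> GateResult:
    """Hypothetical gate: find the widest coverage whose agreement is >= alpha."""
    n = len(llm_labels)
    if n == 0 or not (len(confidences) == len(surrogate_labels) == n):
        raise ValueError("inputs must be non-empty and of equal length")

    # Walk traces from most to least confident, tracking running agreement.
    order = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    best = GateResult(deploy=False, threshold=None, coverage=0.0, agreement=0.0)
    agree = 0
    for k, i in enumerate(order, start=1):
        agree += surrogate_labels[i] == llm_labels[i]
        # Only cut between distinct confidence values, so ties stay together.
        if k < n and confidences[order[k]] == confidences[i]:
            continue
        rate = agree / k
        if rate >= alpha:
            # Later (wider) cutoffs overwrite earlier ones, maximizing coverage.
            best = GateResult(True, confidences[i], k / n, rate)
    return best


def route(
    confidence: float,
    surrogate_label: str,
    gate: GateResult,
    call_llm: Callable[[], str],
) -> str:
    """Answer with the surrogate inside the gated region, otherwise defer."""
    if gate.deploy and gate.threshold is not None and confidence >= gate.threshold:
        return surrogate_label
    return call_llm()


if __name__ == "__main__":
    # Toy held-out traces (made up for illustration, not from the paper).
    gate = parity_gate(
        confidences=[0.95, 0.9, 0.8, 0.6, 0.4],
        surrogate_labels=["a", "b", "a", "c", "b"],
        llm_labels=["a", "b", "a", "a", "a"],
        alpha=0.9,
    )
    print(gate)
```

If no cutoff reaches α, `deploy` stays `False` and `route` always calls the LLM. That mirrors the reported behavior on the natural language inference task, where deployment is withheld rather than forced.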



