Policy-Governed LLM Routing with Intent Matching for Instrument Laboratories

arXiv cs.AI / 5/1/2026



Key Points

The paper introduces a policy-governed routing system for LLM-based tutoring in engineering labs that balances helpful assistance against preserving learning opportunities.
It consists of Routiium, an OpenAI-compatible gateway for managing multiple LLM backends with configurable prompt modifications and detailed usage logging, and EduRouter, a policy-aware service that enforces lab budgets and approvals via embedding-based intent/question matching.
Trace-driven simulations across two engineering lab settings (LED characterization and RC circuit analysis) show that governed policies substantially improve learning-alignment metrics versus ungoverned operation.
A live-model replay of 100 queries indicates that EduRouter routes 75% of queries to a local model, cutting token costs by 66% compared with routing everything to premium models, while maintaining a canonical hit rate of 1.0 on a curated intent question bank.
The authors release Routiium, EduRouter, canonical-task tooling, and simulator configurations to enable replication and future classroom studies.
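
To make the idea of embedding-based intent matching combined with budget and approval policy more concrete, here is a minimal sketch in Python. It is not EduRouter's actual code or API: the intent bank, the bag-of-words stand-in for an embedding model, the 0.5 similarity threshold, the per-call premium cost, and the rule "matched intents go to the local model" are all assumptions made for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass

# Hypothetical canonical intents; the paper's question bank is not reproduced here.
CANONICAL_INTENTS = {
    "led_forward_voltage": "how do I measure the forward voltage of an led",
    "rc_time_constant": "how do I compute the time constant of an rc circuit",
}


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would call an embedding model."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class LabBudget:
    """Assumed per-lab budget for premium-model calls."""
    remaining_usd: float


def match_intent(query: str, threshold: float = 0.5):
    """Return (intent_id, score); intent_id is None when below the threshold."""
    q = embed(query)
    best_id, best_score = None, 0.0
    for intent_id, text in CANONICAL_INTENTS.items():
        score = cosine(q, embed(text))
        if score > best_score:
            best_id, best_score = intent_id, score
    return (best_id, best_score) if best_score >= threshold else (None, best_score)


def route_query(query: str, budget: LabBudget, premium_cost_usd: float = 0.01) -> str:
    """Assumed policy: canonical matches go local, others go premium if budget allows."""
    intent_id, _ = match_intent(query)
    if intent_id is not None:
        return "local"
    if budget.remaining_usd >= premium_cost_usd:
        budget.remaining_usd -= premium_cost_usd
        return "premium"
    return "needs_approval"  # budget exhausted: defer to an instructor approval step


budget = LabBudget(remaining_usd=0.015)
print(route_query("How do I compute the time constant of an RC circuit?", budget))
print(route_query("explain quantum tunnelling in semiconductors", budget))
print(route_query("explain quantum tunnelling in semiconductors", budget))
```

In this sketch the first query matches a canonical intent and stays on the local backend, the second consumes the remaining premium budget, and the third falls through to an approval step. A bag-of-words vector is sensitive to shared stop words, which is one reason a learned embedding model would be used in practice.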

Abstract

AI tutoring systems in engineering labs face a tension between providing sufficient assistance and preserving learning opportunities. Existing systems typically offer instructors limited control over assistance timing, content, or cost. This paper describes a routing and governance system for LLM-based lab assistance comprising two components: Routiium, an OpenAI-compatible gateway that manages multiple LLM backends with configurable prompt modifications and usage logging, and EduRouter, a policy-aware routing service that enforces per-lab budgets, approval workflows, and embedding-based question matching. We evaluated the system using trace-driven simulation calibrated from two engineering labs (LED characterization, RC circuit analysis) and a 100-query replay through live models. In simulations, governed policies (P1/P2) increased the challenge-alignment index from 0.90 to 0.98 and the overlay-adherence score from 0.69 to 0.87 compared to ungoverned operation (P0). The productive-struggle window metric increased from 1.4 to 3.6 simulated turns before high-scaffold hints appeared. In the 100-query replay, EduRouter routed 75% of queries to a local model, reducing token costs by 66% ($0.087 vs. $0.26 for all-premium routing) while maintaining a canonical hit rate of 1.0 for the curated 89-intent question bank. We release Routiium, EduRouter, canonical-task tooling, and simulator configurations to support replication and future classroom studies.
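
As a quick sanity check on the reported saving, the relative reduction can be recomputed from the two cost figures quoted in the abstract (0.087 and 0.26). The variable names below are illustrative only, and the inputs are the rounded values as printed, so the result may differ slightly from a figure computed on unrounded costs.

```python
# Cost figures as quoted in the abstract (rounded values).
governed_cost = 0.087      # replay with EduRouter routing
all_premium_cost = 0.26    # same replay with every query sent to premium models

# Relative reduction = 1 - (governed cost / baseline cost).
reduction = 1 - governed_cost / all_premium_cost
print(f"{reduction:.1%}")  # prints 66.5%
```

The rounded inputs give about 66.5%, in line with the reported 66% reduction; the small gap is presumably due to the paper computing the percentage from unrounded costs.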