SEAR: Schema-Based Evaluation and Routing for LLM Gateways

arXiv cs.AI / 3/31/2026


Key Points

  • SEAR is a schema-based evaluation and routing system for multi-model, multi-provider LLM gateways; it aims to supply the fine-grained quality signals that production evaluation and routing decisions require.
  • It introduces an extensible relational schema that ties together LLM evaluation signals (e.g., context/intent/response characteristics, quality scores, issue attribution) with gateway operational metrics (latency, cost, throughput) via consistent cross-table links.
  • To populate these signals reliably, SEAR proposes self-contained signal instructions, in-schema reasoning, and multi-stage generation that produces database-ready structured outputs, rather than relying on shallow classifiers.
  • By deriving signals through LLM reasoning, SEAR captures more complex request semantics and provides human-interpretable routing explanations.
  • Experiments on thousands of production sessions show strong signal accuracy on human-labeled data and routing outcomes that can reduce costs while maintaining comparable quality.
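
To make the idea of linking evaluation signals to operational metrics concrete, here is a minimal sketch in Python with SQLite. It is not SEAR's actual schema: the table names (eval_signals, gateway_metrics), the columns, the model names, the sample rows, and the 0.05 quality tolerance are all assumptions chosen for illustration. It shows two tables joined on a shared session key and a routing rule that picks the cheapest model whose average quality stays close to the best one for each intent.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical evaluation-signal table (one row per session).
        CREATE TABLE eval_signals (
            session_id    TEXT PRIMARY KEY,
            intent        TEXT NOT NULL,
            quality_score REAL NOT NULL CHECK (quality_score BETWEEN 0 AND 1)
        );
        -- Hypothetical operational-metrics table, linked by session_id.
        CREATE TABLE gateway_metrics (
            session_id TEXT PRIMARY KEY REFERENCES eval_signals(session_id),
            model      TEXT NOT NULL,
            latency_ms REAL NOT NULL,
            cost_usd   REAL NOT NULL
        );
        """
    )
    signals = [
        ("s1", "code_generation", 0.90),
        ("s2", "code_generation", 0.88),
        ("s3", "summarization", 0.92),
        ("s4", "summarization", 0.70),
        ("s5", "code_generation", 0.92),
        ("s6", "code_generation", 0.86),
    ]
    metrics = [
        ("s1", "model-large", 1200.0, 0.020),
        ("s2", "model-small", 400.0, 0.002),
        ("s3", "model-large", 900.0, 0.015),
        ("s4", "model-small", 300.0, 0.001),
        ("s5", "model-large", 1100.0, 0.022),
        ("s6", "model-small", 450.0, 0.002),
    ]
    conn.executemany("INSERT INTO eval_signals VALUES (?, ?, ?)", signals)
    conn.executemany("INSERT INTO gateway_metrics VALUES (?, ?, ?, ?)", metrics)
    return conn


def cheapest_comparable_model(
    conn: sqlite3.Connection, tolerance: float = 0.05
) -> dict[str, str]:
    """For each intent, pick the cheapest model whose average quality is
    within `tolerance` of the best average quality for that intent."""
    rows = conn.execute(
        """
        SELECT e.intent, g.model, AVG(e.quality_score), AVG(g.cost_usd)
        FROM eval_signals AS e
        JOIN gateway_metrics AS g USING (session_id)
        GROUP BY e.intent, g.model
        """
    ).fetchall()

    by_intent: dict[str, list[tuple[str, float, float]]] = {}
    for intent, model, quality, cost in rows:
        by_intent.setdefault(intent, []).append((model, quality, cost))

    routes: dict[str, str] = {}
    for intent, candidates in by_intent.items():
        best_quality = max(quality for _, quality, _ in candidates)
        eligible = [c for c in candidates if c[1] >= best_quality - tolerance]
        routes[intent] = min(eligible, key=lambda c: c[2])[0]
    return routes


if __name__ == "__main__":
    print(cheapest_comparable_model(build_demo_db()))
```

With these made-up rows, the smaller model is selected for code generation, where its average quality is close to the larger model's, while summarization stays on the larger model because the quality gap exceeds the tolerance. Because both signal types sit behind the same join, the aggregated rows also serve as a readable explanation of each routing choice.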

Abstract

Evaluating production LLM responses and routing requests across providers in LLM gateways requires fine-grained quality signals and operationally grounded decisions. To address this gap, we present SEAR, a schema-based evaluation and routing system for multi-model, multi-provider LLM gateways. SEAR defines an extensible relational schema covering both LLM evaluation signals (context, intent, response characteristics, issue attribution, and quality scores) and gateway operational metrics (latency, cost, throughput), with cross-table consistency links across around one hundred typed, SQL-queryable columns. To populate the evaluation signals reliably, SEAR proposes self-contained signal instructions, in-schema reasoning, and multi-stage generation that produces database-ready structured outputs. Because signals are derived through LLM reasoning rather than shallow classifiers, SEAR captures complex request semantics, enables human-interpretable routing explanations, and unifies evaluation and routing in a single query layer. Across thousands of production sessions, SEAR achieves strong signal accuracy on human-labeled data and supports practical routing decisions, including large cost reductions with comparable quality.
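
One plausible reading of "self-contained signal instructions, in-schema reasoning, and multi-stage generation" can be sketched in Python as follows. Everything here is an assumption for illustration rather than the paper's implementation: the SignalSpec type, the "reasoning" and "value" field names, the allowed labels, and the fake_generate stub that stands in for a real LLM call. Each signal carries its own instruction, the model's output must include its reasoning next to the value, and later stages can see the values produced by earlier ones.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SignalSpec:
    """Hypothetical description of one signal column."""
    name: str
    instruction: str          # self-contained: usable without other signals
    allowed: tuple[str, ...]  # closed set of values the column accepts


INTENT = SignalSpec(
    name="intent",
    instruction="Classify the user's request.",
    allowed=("code_generation", "summarization", "other"),
)
QUALITY = SignalSpec(
    name="quality",
    instruction="Rate the response given the request and its intent.",
    allowed=("good", "acceptable", "poor"),
)


def validate_signal(spec: SignalSpec, raw_json: str) -> dict:
    """Parse model output and reject anything that is not database-ready."""
    payload = json.loads(raw_json)
    if not isinstance(payload, dict):
        raise ValueError(f"{spec.name}: expected a JSON object")
    reasoning = payload.get("reasoning")
    value = payload.get("value")
    if not isinstance(reasoning, str) or not reasoning.strip():
        raise ValueError(f"{spec.name}: missing reasoning")
    if value not in spec.allowed:
        raise ValueError(f"{spec.name}: value {value!r} not in {spec.allowed}")
    return {"value": value, "reasoning": reasoning}


def run_stages(
    specs: list[SignalSpec],
    generate: Callable[[SignalSpec, dict], str],
) -> dict:
    """Generate signals one stage at a time; each stage sees earlier results."""
    row: dict[str, str] = {}
    for spec in specs:
        result = validate_signal(spec, generate(spec, dict(row)))
        row[spec.name] = result["value"]
        row[f"{spec.name}_reasoning"] = result["reasoning"]
    return row


def fake_generate(spec: SignalSpec, prior: dict) -> str:
    """Stand-in for an LLM call; returns canned JSON for the demo."""
    canned = {
        "intent": {
            "reasoning": "The user asks for a Python function.",
            "value": "code_generation",
        },
        "quality": {
            "reasoning": f"Given intent={prior.get('intent')}, the code answers the request.",
            "value": "good",
        },
    }
    return json.dumps(canned[spec.name])


if __name__ == "__main__":
    print(run_stages([INTENT, QUALITY], fake_generate))
```

Keeping the reasoning in the same record as the value is what would let a routing decision be explained later from stored rows alone, and validating against a closed set of values is one simple way to keep outputs insertable into typed columns.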