AgentGate: A Lightweight Structured Routing Engine for the Internet of Agents

arXiv cs.AI / 4/10/2026


Key Points

  • The paper introduces AgentGate, a lightweight structured routing engine aimed at efficiently dispatching requests in an “Internet of Agents” under latency, privacy, and cost constraints.
  • Rather than using open-ended text generation for routing, AgentGate frames dispatch as a constrained decision problem split into two stages: action decision and structural grounding.
  • The system decides among options such as single-agent invocation, multi-agent planning, direct response, or safe escalation, then instantiates the chosen action into executable structured outputs (e.g., target agents, arguments, or plans).
  • To make compact LLMs effective in this routing setting, the authors propose routing-oriented fine-tuning with candidate-aware supervision and hard negative examples.
  • Experiments on a curated benchmark using several 3B–7B open-weight models suggest structured routing is practical for resource-constrained deployments, with model differences mainly affecting action prediction, candidate selection, and grounding quality.
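The two-stage split described above can be pictured as a small output schema plus a validator. This is only a minimal sketch, not the paper's implementation: the names Action, Step, RoutingDecision, and validate are hypothetical, and the per-action rules (for example, that a multi-agent plan has at least two steps) are assumptions made for illustration. The point is that a routed output can be checked mechanically against the candidate list, which free-form text generation does not allow.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class Action(Enum):
    """Stage 1 (action decision): the four options named in the summary."""
    SINGLE_AGENT = "single_agent"
    MULTI_AGENT_PLAN = "multi_agent_plan"
    DIRECT_RESPONSE = "direct_response"
    SAFE_ESCALATION = "safe_escalation"


@dataclass
class Step:
    """Stage 2 (structural grounding): one target agent and its arguments."""
    agent: str
    arguments: dict[str, Any] = field(default_factory=dict)


@dataclass
class RoutingDecision:
    action: Action
    steps: list[Step] = field(default_factory=list)


def validate(decision: RoutingDecision, candidates: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the decision is well-formed.

    The per-action step counts below are illustrative assumptions.
    """
    problems = []
    n = len(decision.steps)
    if decision.action is Action.SINGLE_AGENT and n != 1:
        problems.append("single_agent requires exactly one step")
    elif decision.action is Action.MULTI_AGENT_PLAN and n < 2:
        problems.append("multi_agent_plan requires at least two steps")
    elif decision.action in (Action.DIRECT_RESPONSE, Action.SAFE_ESCALATION) and n:
        problems.append(f"{decision.action.value} must not name any agent")
    # Candidate-aware check: every grounded agent must be an offered candidate.
    for step in decision.steps:
        if step.agent not in candidates:
            problems.append(f"unknown agent: {step.agent}")
    return problems


candidates = {"calendar", "weather"}
ok = RoutingDecision(Action.SINGLE_AGENT, [Step("weather", {"city": "Oslo"})])
bad = RoutingDecision(Action.SINGLE_AGENT, [Step("email")])
print(validate(ok, candidates))   # []
print(validate(bad, candidates))  # ['unknown agent: email']
```

A router built this way could reject or retry any model output that fails validation before an agent is ever invoked.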

Abstract

The rapid development of AI agent systems is leading to an emerging Internet of Agents, where specialized agents operate across local devices, edge nodes, private services, and cloud platforms. Although recent efforts have improved agent naming, discovery, and interaction, efficient request dispatch remains an open systems problem under latency, privacy, and cost constraints. In this paper, we present AgentGate, a lightweight structured routing engine for candidate-aware agent dispatch. Instead of treating routing as unrestricted text generation, AgentGate formulates it as a constrained decision problem and decomposes it into two stages: action decision and structural grounding. The first stage determines whether a query should trigger single-agent invocation, multi-agent planning, direct response, or safe escalation, while the second stage instantiates the selected action into executable outputs such as target agents, structured arguments, or multi-step plans. To adapt compact models to this setting, we further develop a routing-oriented fine-tuning scheme with candidate-aware supervision and hard negative examples. Experiments on a curated routing benchmark with several 3B–7B open-weight models show that compact models can provide competitive routing performance in constrained settings, and that model differences are mainly reflected in action prediction, candidate selection, and structured grounding quality. These results indicate that structured routing is a feasible design point for efficient and privacy-aware agent systems, especially when routing decisions must be made under resource-constrained deployment conditions.
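The abstract does not say how the hard negative examples are chosen, so the following is only one plausible way to build a candidate-aware training example, not the authors' procedure. It assumes each agent has a short text description and treats the agents whose descriptions overlap most with the correct agent's (by Jaccard similarity over word sets) as hard negatives. All function names, the agent names, and the similarity measure are hypothetical.

```python
def tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; a deliberately crude stand-in for a real tokenizer."""
    return set(text.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hard_negatives(gold: str, descriptions: dict[str, str], k: int = 2) -> list[str]:
    """Pick the k non-gold agents whose descriptions look most like the gold agent's."""
    gold_tokens = tokens(descriptions[gold])
    scored = [
        (jaccard(gold_tokens, tokens(desc)), name)
        for name, desc in descriptions.items()
        if name != gold
    ]
    # Highest similarity first; ties are broken by name so the result is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


def build_example(query: str, gold: str, descriptions: dict[str, str], k: int = 2) -> dict:
    """Assemble one supervised example: the query, a candidate list, and the target."""
    negatives = hard_negatives(gold, descriptions, k)
    # Sort candidates so the gold agent's position does not leak the answer.
    candidates = sorted([gold, *negatives])
    return {"query": query, "candidates": candidates, "target": gold}


descriptions = {
    "weather_now": "current weather conditions for a city",
    "weather_forecast": "weather forecast for a city next week",
    "flight_search": "search flights between two airports",
    "calendar": "create and list calendar events",
}
example = build_example("Will it rain in Oslo on Friday?", "weather_forecast", descriptions, k=1)
print(example["candidates"])  # ['weather_forecast', 'weather_now']
```

Here the near-duplicate weather_now agent is selected as the distractor, which forces a model trained on such examples to separate closely related agents instead of only rejecting obviously irrelevant ones.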