Towards Near-Real-Time Telemetry-Aware Routing with Neural Routing Algorithms

arXiv cs.LG / 4/6/2026

Key Points

The paper formulates telemetry-aware routing as a delay-aware, closed-loop control problem to account for communication and inference latency that prior neural routing work often ignored.
It introduces a training and evaluation framework that explicitly models delayed network-wide information and limits the assumptions it makes about which telemetry state is available.
The proposed LOGGIA method uses a scalable graph neural network to predict log-space link weights from attributed topology-and-telemetry graphs, combining data-driven pre-training with on-policy reinforcement learning.
Experiments across synthetic and real topologies, including unseen mixed TCP/UDP traffic sequences, show LOGGIA consistently outperforming shortest-path baselines, while neural baselines fail once realistic delays are enforced.
The results suggest that neural routing algorithms like LOGGIA perform best when deployed fully locally, with each router independently observing network state and inferring actions rather than relying on centralized control.
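
The delay-aware framing in the points above can be illustrated with a small sketch. It is not the paper's framework: the class name `DelayedTelemetry`, the `delay_steps` parameter, and the `util` field are hypothetical, and the delay is assumed to be a fixed number of control steps. The sketch only shows the core idea that a routing policy acts on telemetry that is already several steps old.

```python
from collections import deque


class DelayedTelemetry:
    """Releases each telemetry snapshot only after `delay_steps` control steps.

    Hypothetical helper for illustration; a fixed, known delay is an assumption.
    """

    def __init__(self, delay_steps, initial):
        self.delay_steps = delay_steps
        # Pre-fill the buffer so the policy sees `initial` until real data arrives.
        self.buffer = deque([initial] * delay_steps)

    def step(self, snapshot):
        """Record the current snapshot and return the one visible to the policy now."""
        self.buffer.append(snapshot)
        return self.buffer.popleft()


# A burst at step 1 (utilization 0.9) becomes visible to the policy only at step 3.
link = DelayedTelemetry(delay_steps=2, initial={"util": 0.0})
observed = [link.step({"util": u})["util"] for u in (0.1, 0.9, 0.4, 0.2)]
```

With `delay_steps=0` the wrapper returns each snapshot immediately, which corresponds to the delay-free global state that the summary describes as unrealistic.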

Abstract

Routing algorithms are crucial for efficient computer network operations, and in many settings they must be able to react to traffic bursts within milliseconds. Live telemetry data can provide informative signals to routing algorithms, and recent work has trained neural networks to exploit such signals for traffic-aware routing. Yet, aggregating network-wide information is subject to communication delays, and existing neural approaches either assume unrealistic delay-free global states, or restrict routers to purely local telemetry. This leaves their deployability in real-world environments unclear. We cast telemetry-aware routing as a delay-aware closed-loop control problem and introduce a framework that trains and evaluates neural routing algorithms, while explicitly modeling communication and inference delays. On top of this framework, we propose LOGGIA, a scalable graph neural routing algorithm that predicts log-space link weights from attributed topology-and-telemetry graphs. It utilizes a data-driven pre-training stage, followed by on-policy Reinforcement Learning. Across synthetic and real network topologies, and unseen mixed TCP/UDP traffic sequences, LOGGIA consistently outperforms shortest-path baselines, whereas neural baselines fail once realistic delays are enforced. Our experiments further suggest that neural routing algorithms like LOGGIA perform best when deployed fully locally, i.e., observing network states and inferring actions at every router individually, as opposed to centralized decision making.
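
As a rough illustration of what predicting log-space link weights can mean in practice, the sketch below assumes that each predicted value is exponentiated into a strictly positive link cost and then passed to an ordinary shortest-path computation. The abstract does not spell out this step, so the mapping, the function name `shortest_path`, and the toy three-node topology are all assumptions made here, not the authors' implementation.

```python
import heapq
import math


def shortest_path(log_weights, source, target):
    """Dijkstra over link costs exp(log_w).

    `log_weights` maps directed links (u, v) to a log-space weight.
    Exponentiation keeps every cost positive, which Dijkstra requires.
    """
    adjacency = {}
    for (u, v), log_w in log_weights.items():
        adjacency.setdefault(u, []).append((v, math.exp(log_w)))

    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adjacency.get(u, []):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if target not in dist:
        return None  # target unreachable
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Toy topology: the two-hop route a-b-c costs 2.0, the direct link a-c costs e.
calm = {("a", "b"): 0.0, ("b", "c"): 0.0, ("a", "c"): 1.0}
# Hypothetical congestion on b-c raises its log-weight, so the direct link wins.
congested = {**calm, ("b", "c"): 1.5}

calm_route = shortest_path(calm, "a", "c")
congested_route = shortest_path(congested, "a", "c")
```

Working in log space means a model can output any real number per link while the resulting costs stay positive, so the routes remain loop-free under standard shortest-path forwarding.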