OrbitStream: Training-Free Adaptive 360-degree Video Streaming via Semantic Potential Fields

arXiv cs.RO / 3/24/2026


Key Points

  • OrbitStream is a training-free framework for adaptive 360-degree video streaming in teleoperation that targets two hard problems: uncertain viewport prediction under variable gaze and bitrate adaptation over volatile wireless channels.
  • The method models viewport prediction as a Gravitational Viewport Prediction (GVP) task, using semantic objects to create potential fields that attract likely user gaze directions.
  • For buffer regulation, it replaces heavy learning with a Saturation-Based Proportional-Derivative (PD) controller, aiming for robust control behavior rather than black-box decision making.
  • Reported results show 94.7% zero-shot viewport prediction accuracy without user profiling, mean QoE of 2.71 across 3,600 Monte Carlo simulations on diverse network traces, and low average decision latency (~1.01 ms) with minimal rebuffering.
  • The paper positions OrbitStream as competitive with state-of-the-art adaptive algorithms (ranking second among 12), while adding interpretability and eliminating training overhead for safer deployment contexts.
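The gravitational analogy in GVP can be made concrete with a small sketch. The paper does not publish its exact formulation, so the following is a hypothetical illustration: semantic objects act as point "masses" on a 2-D (yaw, pitch) plane, and the predicted viewport center is found by following the net attraction field from the current gaze for a few normalized steps. All names, gains, and the inverse-square force law are assumptions, not the authors' implementation.

```python
import math

def attraction(gaze, objects, g=1.0, eps=1e-6):
    """Sum inverse-square pulls from semantic objects on a (yaw, pitch)
    gaze point. Each object is (x, y, mass); mass plays the role of
    semantic saliency. Purely illustrative force law."""
    fx = fy = 0.0
    for ox, oy, mass in objects:
        dx, dy = ox - gaze[0], oy - gaze[1]
        d2 = dx * dx + dy * dy + eps
        d = math.sqrt(d2)
        f = g * mass / d2          # inverse-square attraction magnitude
        fx += f * dx / d           # unit direction toward the object
        fy += f * dy / d
    return fx, fy

def predict_viewport(gaze, objects, step=0.05, iters=20):
    """Follow the potential field for a few fixed-size steps; the point
    we settle near is the predicted gaze center (attractor basin)."""
    x, y = gaze
    for _ in range(iters):
        fx, fy = attraction((x, y), objects)
        norm = math.hypot(fx, fy)
        if norm < 1e-9:            # already at a field equilibrium
            break
        x += step * fx / norm      # normalized step avoids overshooting
        y += step * fy / norm      # when very close to a heavy object
    return x, y
```

With a single salient object at (1, 0), the predicted gaze drifts from the origin toward it; with several objects, the prediction settles toward the dominant attractor, which is the intuition behind "semantic potential fields."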

Abstract

Adaptive 360° video streaming for teleoperation faces dual challenges: viewport prediction under uncertain gaze patterns and bitrate adaptation over volatile wireless channels. While data-driven and Deep Reinforcement Learning (DRL) methods achieve high Quality of Experience (QoE), their "black-box" nature and reliance on training data can limit deployment in safety-critical systems. To address this, we propose OrbitStream, a training-free framework that combines semantic scene understanding with robust control theory. We formulate viewport prediction as a Gravitational Viewport Prediction (GVP) problem, where semantic objects generate potential fields that attract user gaze. Furthermore, we employ a Saturation-Based Proportional-Derivative (PD) Controller for buffer regulation. On object-rich teleoperation traces, OrbitStream achieves a 94.7% zero-shot viewport prediction accuracy without user-specific profiling, approaching trajectory-extrapolation baselines (~98.5%). Across 3,600 Monte Carlo simulations on diverse network traces, OrbitStream yields a mean QoE of 2.71. It ranks second among 12 evaluated algorithms, close to the top-performing BOLA-E (2.80) while outperforming FastMPC (1.84). The system exhibits an average decision latency of 1.01 ms with minimal rebuffering events. By providing competitive QoE with interpretability and zero training overhead, OrbitStream demonstrates that physics-based control, combined with semantic modeling, offers a practical solution for 360° streaming in teleoperation.
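A saturation-based PD controller for buffer regulation is a standard control-theoretic pattern, and a minimal sketch clarifies why it is cheap and interpretable. The gains, target occupancy, bitrate ladder, and the mapping from control signal to bitrate below are all assumptions for illustration; the paper's actual controller may differ.

```python
def clamp(x, lo, hi):
    return max(lo, min(hi, x))

class SaturatedPD:
    """Hypothetical saturation-based PD buffer controller: regulates
    playback buffer toward a target occupancy and maps the clamped
    control signal onto a discrete bitrate ladder."""

    def __init__(self, target_s=10.0, kp=0.4, kd=0.2,
                 ladder_kbps=(300, 750, 1200, 2400, 4800)):
        self.target = target_s            # desired buffer level (seconds)
        self.kp, self.kd = kp, kd         # illustrative PD gains
        self.ladder = sorted(ladder_kbps)
        self.prev_err = None

    def select_bitrate(self, buffer_s, dt=1.0):
        err = buffer_s - self.target      # positive -> headroom to spend
        derr = 0.0 if self.prev_err is None else (err - self.prev_err) / dt
        self.prev_err = err
        # Saturation: clamp the PD output so a single noisy buffer sample
        # can never command an extreme bitrate jump.
        u = clamp(self.kp * err + self.kd * derr, -1.0, 1.0)
        # Map u in [-1, 1] linearly to an index into the bitrate ladder.
        idx = round((u + 1.0) / 2.0 * (len(self.ladder) - 1))
        return self.ladder[int(idx)]
```

A nearly empty buffer drives the controller to the lowest rung, while a full buffer allows the top rung; because the whole decision is one clamped linear expression, per-step latency on the order of a millisecond (as reported) is plausible, and the behavior is auditable in a way a DRL policy is not.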