Hybrid Visual Telemetry for Bandwidth-Constrained Robotic Vision: A Pilot Study with HEVC Base Video and JPEG ROI Stills

arXiv cs.RO / 5/5/2026


Key Points

  • The paper addresses a common bandwidth problem in robotic/surveillance vision: a single compressed stream can preserve motion but often loses the fine detail required for reliable object recognition and decision-making.
  • It proposes a two-channel hybrid visual telemetry approach that combines a low-bitrate HEVC base video stream for continuous scene awareness with selectively transmitted high-detail ROI stills for event-driven identification and analytics.
  • Rather than claiming a new still-image codec is superior, the study establishes the hybrid transmission paradigm using a reproducible stack (x265/HEVC for video plus JPEG for ROI refinement).
  • The authors formalize the task as bitrate-constrained information selection and compare video-only versus hybrid schemes under matched total communication budgets using UAV-oriented datasets, multiple ROI triggering policies, and object-level classification refinement.
  • The work is positioned as a methodological foundation for a follow-up study exploring “JPEG AI” as a semantic still-image channel within the same hybrid architecture.

Abstract

Bandwidth-constrained robotic and surveillance systems often rely on a single compressed video stream to support both continuous scene awareness and downstream machine perception. In practice, this creates a mismatch: low-bitrate video can preserve motion and coarse context, but often loses the fine local detail needed for reliable object recognition and decision-making. Motivated by a hybrid architecture in which low-resolution video supports dynamic scene understanding while event-driven high-detail regions of interest (ROIs) support close-up identification and analytics, this paper formalizes a two-channel visual telemetry scheme in which a continuous low-bitrate video stream is augmented by selectively transmitted high-detail still ROIs. This first paper does not attempt to prove the superiority of a new still-image codec. Instead, it establishes the hybrid transmission paradigm itself using a practical and reproducible codec stack: x265/HEVC for the base video stream and JPEG stills for ROI refinement. We formulate the problem as bitrate-constrained information selection for robotic vision and define an experimental protocol in which video-only and hybrid schemes are compared under matched total communication budgets. The study is designed around UAV-oriented datasets, two practical bitrate regimes, several ROI triggering policies, and object-level classification refinement on selectively transmitted ROI stills. The resulting paper lays the methodological foundation for a second-stage investigation of JPEG AI as the semantic still-image channel within the same hybrid architecture.
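
To make the idea of "matched total communication budgets" concrete, the sketch below shows one way a hybrid scheme could split a fixed budget between the base video and ROI stills, and then pick which stills to send. It is an illustration under stated assumptions, not the authors' protocol: the names `RoiCandidate`, `roi_byte_budget`, and `select_rois`, the utility score, the greedy rule, and all numeric values are hypothetical, and the abstract does not specify the actual bitrates or selection policy.

```python
from dataclasses import dataclass


@dataclass
class RoiCandidate:
    """A candidate ROI still. All fields are hypothetical, for illustration only."""
    roi_id: str
    jpeg_bytes: int   # encoded size of the JPEG crop (assumed > 0)
    utility: float    # assumed benefit score, e.g. expected gain in classification confidence


def roi_byte_budget(total_kbps: float, video_kbps: float, window_s: float) -> int:
    """Bytes left for ROI stills in a time window after paying for the base video.

    Uses 1 kbps = 1000 bits/s. A video-only baseline would spend total_kbps on
    video alone; the hybrid scheme spends video_kbps on video and the rest on stills.
    """
    if video_kbps > total_kbps:
        raise ValueError("base video alone exceeds the total budget")
    return int((total_kbps - video_kbps) * 1000 * window_s / 8)


def select_rois(candidates, budget_bytes):
    """Greedy pick by utility per byte (a simple heuristic, not an optimal knapsack)."""
    ranked = sorted(candidates, key=lambda c: c.utility / c.jpeg_bytes, reverse=True)
    chosen, used = [], 0
    for c in ranked:
        # Skip any still that would push the window over its byte budget.
        if used + c.jpeg_bytes <= budget_bytes:
            chosen.append(c)
            used += c.jpeg_bytes
    return chosen, used


if __name__ == "__main__":
    # Hypothetical numbers: 500 kbps total, 400 kbps base video, 2-second window.
    budget = roi_byte_budget(total_kbps=500, video_kbps=400, window_s=2)
    candidates = [
        RoiCandidate("a", jpeg_bytes=12000, utility=0.6),
        RoiCandidate("b", jpeg_bytes=9000, utility=0.3),
        RoiCandidate("c", jpeg_bytes=15000, utility=0.9),
    ]
    chosen, used = select_rois(candidates, budget)
    print(budget, [c.roi_id for c in chosen], used)
```

Greedy ranking by utility per byte is chosen here only because it is short and easy to reason about. An ROI triggering policy of the kind the abstract mentions would decide which candidates exist in the first place, and a real system would also need to account for packet overhead and latency.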