AI Navigate

ProGVC: Progressive-based Generative Video Compression via Auto-Regressive Context Modeling

arXiv cs.CV / 3/19/2026


Key Points

  • ProGVC is a progressive generative video codec: it encodes video into hierarchical multi-scale residual token maps and adapts its bitrate by transmitting only a coarse-to-fine subset of scales.
  • A Transformer-based multi-scale autoregressive context model estimates token probabilities. The same model drives entropy coding of the transmitted tokens and, at the decoder, predicts the truncated fine-scale tokens to restore perceptual detail.
  • The framework unifies progressive transmission, entropy coding, and detail synthesis in a single codec, targeting two gaps in existing perceptual codecs: no native support for variable bitrate or progressive delivery, and generative modules that are only weakly coupled with entropy coding.
  • According to the abstract, experiments show promising perceptual compression performance at low bitrates together with practical scalability.
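
The coarse-to-fine idea in the key points can be illustrated with a toy example. The sketch below is not the ProGVC tokenizer: it assumes a 1-D signal, average pooling, nearest-neighbour upsampling, and plain scalar quantization in place of learned video tokens, and all function names are hypothetical. It only shows how each scale encodes the residual left by the coarser scales, and how dropping the finest scales lowers the rate at the cost of detail.

```python
def downsample(signal, size):
    """Average-pool a 1-D signal to `size` samples (len(signal) must be divisible by size)."""
    step = len(signal) // size
    return [sum(signal[i * step:(i + 1) * step]) / step for i in range(size)]


def upsample(coarse, size):
    """Nearest-neighbour upsample a 1-D signal to `size` samples."""
    step = size // len(coarse)
    return [coarse[i // step] for i in range(size)]


def encode_residual_scales(signal, scales, step=0.5):
    """Return one integer token map per scale, ordered coarse to fine.

    Each scale quantizes what the coarser scales failed to reconstruct.
    Scalar quantization with step size `step` is an assumption of this sketch.
    """
    n = len(signal)
    recon = [0.0] * n
    maps = []
    for size in scales:
        residual = [x - r for x, r in zip(signal, recon)]
        tokens = [round(v / step) for v in downsample(residual, size)]
        maps.append(tokens)
        contribution = upsample([t * step for t in tokens], n)
        recon = [r + c for r, c in zip(recon, contribution)]
    return maps


def decode_scales(maps, n, step=0.5):
    """Reconstruct from any coarse-to-fine prefix of the token maps."""
    recon = [0.0] * n
    for tokens in maps:
        contribution = upsample([t * step for t in tokens], n)
        recon = [r + c for r, c in zip(recon, contribution)]
    return recon


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


signal = [1.0, 2.0, 4.0, 3.0, 0.0, -1.0, 2.0, 5.0]
maps = encode_residual_scales(signal, scales=[1, 2, 4, 8])

# Transmitting a longer prefix of scales costs more tokens and lowers the error.
for k in range(1, len(maps) + 1):
    sent = maps[:k]
    n_tokens = sum(len(m) for m in sent)
    print(k, n_tokens, mse(signal, decode_scales(sent, len(signal))))
```

In this toy setting a receiver that gets only the first scales simply reconstructs a blurrier signal. ProGVC, as described in the abstract, instead predicts the truncated fine-scale tokens at the decoder with its context model.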

Abstract

Perceptual video compression leverages generative priors to reconstruct realistic textures and motions at low bitrates. However, existing perceptual codecs often lack native support for variable bitrate and progressive delivery, and their generative modules are weakly coupled with entropy coding, limiting bitrate reduction. Inspired by the next-scale prediction in the Visual Auto-Regressive (VAR) models, we propose ProGVC, a Progressive-based Generative Video Compression framework that unifies progressive transmission, efficient entropy coding, and detail synthesis within a single codec. ProGVC encodes videos into hierarchical multi-scale residual token maps, enabling flexible rate adaptation by transmitting a coarse-to-fine subset of scales in a progressive manner. A Transformer-based multi-scale autoregressive context model estimates token probabilities, utilized both for efficient entropy coding of the transmitted tokens and for predicting truncated fine-scale tokens at the decoder to restore perceptual details. Extensive experiments demonstrate that as a new coding paradigm, ProGVC delivers promising perceptual compression performance at low bitrates while offering practical scalability at the same time.
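
The dual role of the context model described in the abstract (probabilities for entropy coding, plus prediction of tokens that were never sent) can be sketched with a much simpler stand-in. The example below assumes an adaptive order-1 count model instead of a Transformer, and measures the ideal code length, the sum of -log2 p over the tokens, rather than running a real arithmetic coder. The class and function names are hypothetical.

```python
import math
from collections import defaultdict


class Order1Model:
    """Adaptive order-1 context model with add-one smoothing.

    A stand-in for the paper's Transformer context model (an assumption of
    this sketch): it predicts the next token from the previous token only.
    """

    def __init__(self, vocab_size):
        self.vocab_size = vocab_size
        self.counts = defaultdict(lambda: [1] * vocab_size)

    def probs(self, context):
        counts = self.counts[context]
        total = sum(counts)
        return [c / total for c in counts]

    def update(self, context, token):
        self.counts[context][token] += 1


def ideal_code_length(tokens, model):
    """Bits an ideal entropy coder would spend, updating the model as it goes."""
    bits = 0.0
    context = None  # no previous token at the start of the stream
    for token in tokens:
        bits += -math.log2(model.probs(context)[token])
        model.update(context, token)
        context = token
    return bits


def predict_missing(model, context, n):
    """Greedily fill in n tokens that were not transmitted."""
    predicted = []
    for _ in range(n):
        p = model.probs(context)
        token = max(range(len(p)), key=p.__getitem__)
        predicted.append(token)
        context = token
    return predicted


vocab_size = 4
tokens = [0, 1] * 32  # a highly predictable toy token stream

model = Order1Model(vocab_size)
adaptive_bits = ideal_code_length(tokens, model)
uniform_bits = len(tokens) * math.log2(vocab_size)  # cost with no context model
print(adaptive_bits, uniform_bits)

# The same model that priced the transmitted tokens now guesses the truncated ones.
print(predict_missing(model, context=tokens[-1], n=4))
```

The better the model's probabilities match the token stream, the fewer bits the entropy coder needs. That is the sense in which coupling the generative model with entropy coding can reduce bitrate.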