DreamStereo: Towards Real-Time Stereo Inpainting for HD Videos

arXiv cs.CV / 4/15/2026


Key Points

  • DreamStereo proposes a real-time approach to stereo video inpainting by combining geometry-aware warping, occlusion-mask generation, and sparsity-aware diffusion inference.
  • It introduces Gradient-Aware Parallax Warping (GAPW) to produce continuous edges and smooth occlusion regions using backward warping plus gradients of the coordinate mapping.
  • It adds a Parallax-Based Dual Projection (PBDP) strategy to generate geometrically consistent stereo inpainting pairs and accurate occlusion masks without needing stereo video inputs.
  • It presents Sparsity-Aware Stereo Inpainting (SASI), which cuts more than 70% of redundant tokens and achieves a reported 10.7x speedup during diffusion inference.
  • The method is reported to produce comparable quality to full computation while enabling HD (768×1280) stereo inpainting at 25 FPS on a single A100 GPU.
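
The warping idea behind GAPW can be illustrated with a minimal sketch (hypothetical code, not the paper's implementation): backward-warp the source view along the disparity, and flag disocclusions where the horizontal gradient of the coordinate mapping indicates local stretching. The function name, threshold, and nearest-neighbour sampling here are all illustrative assumptions.

```python
import numpy as np

def backward_warp_with_mask(src_view: np.ndarray, disparity: np.ndarray,
                            stretch_thresh: float = 2.0):
    """Illustrative sketch, not the paper's GAPW.  src_view, disparity:
    (H, W) arrays.  Each target pixel x samples the source at x - d(x);
    pixels where the mapping's horizontal gradient stretches strongly are
    marked as occlusion regions to be inpainted."""
    h, w = disparity.shape
    xs = np.tile(np.arange(w, dtype=np.float64), (h, 1))
    # coordinate mapping: where in the source each target pixel samples from
    src_x = xs - disparity
    # backward warp (nearest-neighbour here; bilinear in practice)
    src_idx = np.clip(np.round(src_x).astype(int), 0, w - 1)
    warped = np.take_along_axis(src_view, src_idx, axis=1)
    # |d src_x / dx| >> 1 means the mapping stretches locally,
    # i.e. content is being disoccluded and must be filled
    grad = np.abs(np.gradient(src_x, axis=1))
    occlusion_mask = grad > stretch_thresh
    return warped, occlusion_mask
```

Because the mask comes from the gradient of a continuous coordinate mapping rather than from a forward splat, its boundaries stay smooth, which matches the continuous-edge motivation stated above.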

Abstract

Stereo video inpainting, which aims to fill the occluded regions of warped videos with visually coherent content while maintaining temporal consistency, remains a challenging open problem. The regions to be filled are scattered along object boundaries and occupy only a small fraction of each frame, leading to two key challenges. First, existing approaches perform poorly on such tasks due to the scarcity of high-quality stereo inpainting datasets, which limits their ability to learn effective inpainting priors. Second, these methods apply equal processing to all regions of the frame, even though most pixels require no modification, resulting in substantial redundant computation. To address these issues, we introduce three interconnected components. We first propose Gradient-Aware Parallax Warping (GAPW), which leverages backward warping and the gradient of the coordinate mapping function to obtain continuous edges and smooth occlusion regions. Then, a Parallax-Based Dual Projection (PBDP) strategy is introduced, which incorporates GAPW to produce geometrically consistent stereo inpainting pairs and accurate occlusion masks without requiring stereo videos. Finally, we present Sparsity-Aware Stereo Inpainting (SASI), which reduces over 70% of redundant tokens, achieving a 10.7x speedup during diffusion inference and delivering results comparable to its full-computation counterpart, enabling real-time processing of HD (768×1280) videos at 25 FPS on a single A100 GPU.
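
The sparsity idea can be sketched as follows (hypothetical code, not the paper's SASI): since most patches need no modification, the expensive diffusion denoiser runs only on tokens whose patches fall inside the occlusion mask, and the untouched tokens pass through unchanged. The function name and token layout are illustrative assumptions.

```python
import numpy as np

def sparse_denoise(tokens: np.ndarray, patch_mask: np.ndarray, denoise_fn):
    """Illustrative sketch, not the paper's SASI.  tokens: (N, C) patch
    tokens; patch_mask: (N,) bool, True where a patch overlaps an occluded
    region.  Only the active tokens are sent through the heavy denoiser;
    the rest are copied through, skipping redundant computation."""
    out = tokens.copy()
    active = np.flatnonzero(patch_mask)
    if active.size:
        # heavy model runs on the small active subset (e.g. <30% of tokens)
        out[active] = denoise_fn(tokens[active])
    return out
```

With under 30% of tokens active per step, the denoiser's per-step cost drops roughly in proportion, which is the mechanism behind the reported speedup.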
