VDPP: Video Depth Post-Processing for Speed and Scalability

arXiv cs.CV / 4/9/2026


Key Points

  • The paper introduces VDPP (Video Depth Post-Processing), a modular framework designed to improve video depth estimation by enhancing existing post-processing approaches rather than retraining tightly coupled end-to-end models.
  • VDPP targets geometric refinement in low-resolution space using dense residual learning, replacing costly scene reconstruction with more efficient computation.
  • The method runs at over 43.5 FPS on an NVIDIA Jetson Orin Nano while maintaining temporal coherence comparable to end-to-end systems.
  • Unlike RGB-dependent alternatives, VDPP is RGB-free, enabling true scalability by immediately integrating with evolving single-image depth estimators without retraining.
  • The authors position VDPP as a practical real-time, memory-efficient solution for edge deployment, addressing the speed, accuracy, and scalability limitations of prior post-processing methods.
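
The residual-refinement idea behind these points can be illustrated with a toy sketch. The code below is NOT the paper's network: it stands in for the learned dense residual predictor with a simple temporal moving average, purely to show the data flow the key points describe — an RGB-free correction computed in low-resolution space and added back to the full-resolution per-frame depth. The function name, the block-averaging downsampler, and the smoothing window are all illustrative assumptions.

```python
import numpy as np

def refine_depth_sequence(depths, scale=4, window=3):
    """Toy RGB-free post-processing sketch (not VDPP's actual network).

    depths: (T, H, W) array of per-frame depth maps from any
    single-image estimator. A residual correction is computed in
    low-resolution space and added back at full resolution; a temporal
    moving average stands in for the learned residual predictor.
    """
    T, H, W = depths.shape
    # 1. Move to low-resolution space by block-averaging each frame.
    lo = depths.reshape(T, H // scale, scale, W // scale, scale).mean(axis=(2, 4))
    # 2. "Predict" a residual: difference between a temporally smoothed
    #    low-res sequence and the raw low-res sequence.
    pad = window // 2
    padded = np.pad(lo, ((pad, pad), (0, 0), (0, 0)), mode="edge")
    smooth = np.stack([padded[t:t + window].mean(axis=0) for t in range(T)])
    residual = smooth - lo
    # 3. Upsample the residual (nearest-neighbor) and apply it to the
    #    original full-resolution depth — no scene reconstruction, no RGB.
    up = residual.repeat(scale, axis=1).repeat(scale, axis=2)
    return depths + up

# Usage: a static scene with per-frame depth flicker, the classic
# temporal inconsistency of independent single-image predictions.
rng = np.random.default_rng(0)
ramp = np.linspace(1.0, 2.0, 16)                     # simple ground-truth depth ramp
gt = np.broadcast_to(ramp, (8, 16, 16)).copy()
noisy = gt + 0.5 * rng.standard_normal(8)[:, None, None]
refined = refine_depth_sequence(noisy)
jitter_before = np.abs(np.diff(noisy, axis=0)).mean()
jitter_after = np.abs(np.diff(refined, axis=0)).mean()
print(jitter_after < jitter_before)                  # temporal jitter is reduced
```

Because the correction only ever touches the depth maps themselves, any improved single-image estimator can be swapped in upstream without retouching this stage — the modularity the key points attribute to an RGB-free design.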

Abstract

Video depth estimation is essential for providing 3D scene structure in applications ranging from autonomous driving to mixed reality. Although current end-to-end (E2E) models have achieved state-of-the-art performance, they function as tightly coupled systems that suffer from a significant adaptation lag whenever superior single-image depth estimators are released. To mitigate this issue, post-processing methods such as NVDS offer a modular plug-and-play alternative that can incorporate any evolving image depth model without retraining. However, existing post-processing methods still struggle to match the efficiency and practicality of E2E systems due to limited speed, limited accuracy, and reliance on RGB input. In this work, we revitalize the role of post-processing by proposing VDPP (Video Depth Post-Processing), a framework that improves the speed and accuracy of post-processing methods for video depth estimation. By shifting the paradigm from computationally expensive scene reconstruction to targeted geometric refinement, VDPP operates purely in low-resolution space, with dense residual learning applied to geometric representations rather than full reconstructions. This design achieves exceptional speed (>43.5 FPS on an NVIDIA Jetson Orin Nano) while matching the temporal coherence of E2E systems. Furthermore, VDPP's RGB-free architecture ensures true scalability, enabling immediate integration with any evolving image depth model. Our results demonstrate that VDPP provides a superior balance of speed, accuracy, and memory efficiency, making it the most practical solution for real-time edge deployment. Our project page is at https://github.com/injun-baek/VDPP