AsyncMDE: Real-Time Monocular Depth Estimation via Asynchronous Spatial Memory
arXiv cs.CV / 3/12/2026
Key Points
- AsyncMDE introduces an asynchronous depth perception system that splits work between a foundation model producing spatial features in the background and a lightweight foreground model that fuses memory with current observations to estimate depth.
- The system enables cross-frame feature reuse with complementary fusion and autoregressive memory updates, bounding accuracy degradation between memory refreshes.
- It is compact (3.83M parameters) and delivers 237 FPS on an RTX 4090, recovering 77% of the accuracy gap to the foundation model with 25x fewer parameters; it also runs at 161 FPS on a Jetson AGX Orin with TensorRT, demonstrating edge feasibility.
- Validation on indoor static, dynamic, and synthetic extreme-motion benchmarks shows graceful degradation between refreshes and practical real-time performance.
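The asynchronous split described above can be sketched minimally: a slow "foundation" pass refreshes spatial memory in a background thread, while a fast "foreground" pass fuses that memory with each incoming frame without ever blocking on the heavy model. All names and the fusion/update rules below are illustrative stand-ins, not the paper's actual models or API.

```python
import threading
import queue
import time

def foundation_features(frame):
    """Stand-in for the heavy foundation model (slow, high quality)."""
    time.sleep(0.01)  # simulate expensive inference
    return [x * 2.0 for x in frame]

def fuse_and_estimate(memory, frame):
    """Stand-in for the lightweight foreground model: fuse memory with the frame."""
    return [0.5 * m + 0.5 * f for m, f in zip(memory, frame)]

class AsyncDepthEstimator:
    """Hypothetical sketch of an asynchronous background/foreground split."""

    def __init__(self, init_frame):
        self.memory = foundation_features(init_frame)  # initial spatial memory
        self.lock = threading.Lock()
        self.pending = queue.Queue(maxsize=1)          # at most one refresh in flight
        threading.Thread(target=self._refresh_loop, daemon=True).start()

    def _refresh_loop(self):
        # Background: recompute spatial memory from the most recently submitted frame.
        while True:
            frame = self.pending.get()
            feats = foundation_features(frame)
            with self.lock:
                self.memory = feats                    # memory update step

    def estimate(self, frame):
        # Foreground: schedule a memory refresh, but never wait for it.
        try:
            self.pending.put_nowait(frame)
        except queue.Full:
            pass                                       # a refresh is already running
        with self.lock:
            return fuse_and_estimate(self.memory, frame)

est = AsyncDepthEstimator([1.0, 2.0, 3.0])
depths = [est.estimate([1.0, 2.0, 3.0]) for _ in range(5)]
```

The foreground path here only ever takes a lock and runs the cheap fusion, which is the property that makes high frame rates possible; between refreshes it reuses stale memory, which is where the "graceful degradation" trade-off comes from.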