Adaptive Depth-converted-Scale Convolution for Self-supervised Monocular Depth Estimation
arXiv cs.CV / 4/10/2026
Key Points
- The paper addresses self-supervised monocular depth estimation by explicitly handling the depth–scale ambiguity in monocular video: the same object appears at different sizes across frames, so apparent size alone cannot disambiguate an object's depth from its physical scale.
- It introduces Depth-converted-Scale Convolution (DcSConv), which adaptively selects convolution receptive-field scales based on the depth–scale prior rather than relying on local deformation of convolution filters.
- The authors further propose Depth-converted-Scale aware Fusion (DcS-F) to adaptively combine DcSConv-enhanced features with conventional convolution features.
- DcSConv is designed as a plug-and-play module that can be added on top of existing CNN-based depth estimation methods, improving performance on KITTI.
- Experiments on the KITTI benchmark show up to an 11.6% reduction in Sq Rel over the baselines, and ablations confirm that DcSConv and DcS-F each contribute to the gains.
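The core prior behind DcSConv can be sketched in code. This is a minimal illustrative sketch, not the paper's implementation: the function names, the fixed dilation set, and the `base_depth` parameter are all assumptions introduced here. It shows only the idea that apparent object size is roughly inversely proportional to depth, so a pixel's receptive-field scale can be chosen from its predicted depth rather than learned via local filter deformation.

```python
# Hypothetical sketch of depth-converted scale selection (all names and
# constants are assumptions, not taken from the paper).

def depth_to_scale_index(depth, base_depth=10.0, scales=(1, 2, 4)):
    """Map a per-pixel depth to an index into a fixed set of dilation scales.

    Nearer pixels (small depth) cover larger apparent object sizes, so they
    receive a larger receptive field; farther pixels receive a smaller one.
    """
    # Apparent size ~ base_depth / depth; guard against division by zero.
    ratio = base_depth / max(depth, 1e-6)
    # Pick the largest available scale that does not exceed the ratio;
    # fall back to the smallest scale for very distant pixels.
    idx = 0
    for i, s in enumerate(scales):
        if s <= ratio:
            idx = i
    return idx

def select_multiscale_features(depth_map, multiscale_feats, scales=(1, 2, 4)):
    """Per pixel, pick the feature computed at the depth-appropriate dilation.

    multiscale_feats[k][y][x] holds the response of a convolution applied
    with dilation scales[k] at pixel (y, x).
    """
    h, w = len(depth_map), len(depth_map[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            k = depth_to_scale_index(depth_map[y][x], scales=scales)
            out[y][x] = multiscale_feats[k][y][x]
    return out
```

In this toy version, a pixel at depth 2 m (ratio 5 under `base_depth=10`) selects the dilation-4 response, while a pixel at depth 50 m (ratio 0.2) falls back to the dilation-1 response; the paper's DcS-F module would then fuse such scale-selected features with ordinary convolution features.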