EfficientPENet: Real-Time Depth Completion from Sparse LiDAR via Lightweight Multi-Modal Fusion
arXiv cs.CV · April 22, 2026
Key Points
- EfficientPENet targets real-time depth completion from sparse LiDAR plus RGB, addressing the latency and model-size limits of existing heavy backbones on embedded hardware.
- The proposed architecture keeps a two-branch design: the LiDAR stream replaces the usual depth encoder with a ConvNeXt-based backbone that uses sparsity-invariant convolutions, while the RGB branch is built from ImageNet-pretrained ConvNeXt blocks.
- Predictions are refined with a Convolutional Spatial Propagation Network (CSPN) and fused using late fusion combined with multi-scale deep supervision.
- A position-aware test-time augmentation corrects coordinate tensors when horizontally flipping inputs, improving consistency and reducing inference error.
- On the KITTI depth completion benchmark, EfficientPENet reports RMSE of 631.94 mm with 36.24M parameters and 20.51 ms latency (48.76 FPS), delivering a large speed/size improvement versus BP-Net while keeping competitive accuracy.
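The sparsity-invariant convolutions mentioned for the LiDAR branch handle inputs where most pixels carry no depth. A minimal sketch of the idea, following the standard Uhrig et al. formulation (the paper's exact variant may differ): convolve only over valid LiDAR pixels and renormalize by the number of valid entries under the kernel.

```python
import numpy as np

def sparsity_invariant_conv(depth, mask, kernel, eps=1e-8):
    """Sketch of a sparsity-invariant convolution for sparse depth maps.

    depth:  (H, W) sparse depth values (zeros where no LiDAR return)
    mask:   (H, W) validity mask, 1.0 where depth is observed
    kernel: (kh, kw) convolution weights
    Returns the normalized response and the propagated validity mask.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    d = np.pad(depth * mask, ((ph, ph), (pw, pw)))
    m = np.pad(mask, ((ph, ph), (pw, pw)))
    h, w = depth.shape
    out = np.zeros((h, w), dtype=np.float64)
    new_mask = np.zeros((h, w), dtype=np.float64)
    for i in range(h):
        for j in range(w):
            win_d = d[i:i + kh, j:j + kw]
            win_m = m[i:i + kh, j:j + kw]
            # Weight only observed pixels, then divide by the count of
            # valid entries so output magnitude is density-independent.
            out[i, j] = np.sum(win_d * kernel) / (np.sum(win_m) + eps)
            # A pixel becomes valid if any observed pixel falls in its window.
            new_mask[i, j] = float(win_m.max() > 0)
    return out, new_mask
```

The normalization is what makes the operator invariant to LiDAR density: a window containing one valid return and a window containing five both produce depth-scaled outputs rather than sums that grow with point count.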
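The position-aware test-time augmentation addresses a subtle bug in naive flip TTA: if the network consumes per-pixel coordinate channels, horizontally flipping those channels along with the image leaves x-coordinates increasing right-to-left, inconsistent with the flipped pixel grid. A toy sketch of the correction, assuming (x, y) coordinate channels as input (names and shapes are illustrative, not the paper's code):

```python
import numpy as np

def make_coords(h, w):
    # Per-pixel (x, y) coordinate channels, as depth networks often
    # feed positional information alongside RGB and sparse LiDAR.
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    return np.stack([xs, ys], axis=0)  # shape (2, h, w)

def flip_inputs_position_aware(image, coords):
    """Flip the image, then remap the x-channel so coordinates still
    describe the flipped pixel grid (x -> W - 1 - x)."""
    flipped_img = image[..., ::-1]
    flipped_coords = coords[..., ::-1].copy()
    w = coords.shape[-1]
    # Without this remap, the x-channel would run right-to-left after
    # the flip; correcting it restores a geometrically valid input.
    flipped_coords[0] = (w - 1) - flipped_coords[0]
    return flipped_img, flipped_coords
```

At inference one would run the model on both the original and the corrected flipped inputs, un-flip the second prediction, and average the two depth maps.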