FF3R: Feedforward Feature 3D Reconstruction from Unconstrained Views
arXiv cs.CV / 4/14/2026
Key Points
- The paper introduces FF3R, a fully annotation-free, feed-forward framework that unifies geometric and semantic reasoning for 3D reconstruction from unconstrained multi-view image sequences.
- FF3R removes the need for camera poses, depth maps, and semantic labels by using only rendering supervision on RGB and feature maps, aiming to reduce redundant pipelines and error accumulation.
- It tackles global semantic inconsistency and local structural inconsistency with two components: a Token-wise Fusion Module, which uses cross-attention to enrich geometry tokens with semantic context, and a Semantic-Geometry Mutual Boosting mechanism, which combines geometry-guided feature warping with semantic-aware voxelization.
- Experiments on ScanNet and DL3DV-10K report improved results across novel-view synthesis, open-vocabulary semantic segmentation, and depth estimation, with strong generalization to in-the-wild scenarios.
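The Token-wise Fusion Module described above can be illustrated with a minimal cross-attention sketch: geometry tokens act as queries and attend over semantic tokens, with a residual connection preserving the original geometric signal. This is an illustrative sketch only; the function name, the absence of learned query/key/value projections, and the single-head formulation are simplifying assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def token_fusion(geo_tokens, sem_tokens):
    """Cross-attention fusion (hypothetical sketch, not FF3R's exact module).

    geo_tokens: (N_g, d) geometry token embeddings, used as queries
    sem_tokens: (N_s, d) semantic token embeddings, used as keys and values
    Returns geometry tokens enriched with semantic context, shape (N_g, d).
    """
    d = geo_tokens.shape[-1]
    # Scaled dot-product attention scores: each geometry token
    # attends over all semantic tokens.
    attn = softmax(geo_tokens @ sem_tokens.T / np.sqrt(d), axis=-1)
    # Residual connection keeps the original geometric information.
    return geo_tokens + attn @ sem_tokens

rng = np.random.default_rng(0)
geo = rng.normal(size=(5, 8))   # 5 geometry tokens, dim 8
sem = rng.normal(size=(7, 8))   # 7 semantic tokens, dim 8
fused = token_fusion(geo, sem)  # shape (5, 8)
```

In practice such a module would use learned projection matrices and multiple heads; the sketch only shows the core idea of geometry tokens querying semantic context.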