First Shape, Then Meaning: Efficient Geometry and Semantics Learning for Indoor Reconstruction
arXiv cs.CV / 5/6/2026
Key Points
- The paper proposes FSTM, a two-step neural surface reconstruction method that learns indoor geometry first and semantics second to better recover scene context.
- It performs a “geometry warm-up” using RGB and geometric cues without semantic supervision, then estimates semantic fields after geometry stabilizes.
- Compared with standard joint geometry+semantics optimization and multi-SDF designs, FSTM improves reconstruction quality while avoiding specialized modules or complex multi-SDF architectures.
- Experiments on synthetic and real indoor datasets show FSTM trains 2.3× faster on Replica, is more robust on ScanNet++, and achieves higher recall by reconstructing more object surfaces.
- The authors announce that code will be released publicly, supporting adoption and further experimentation.
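
The two-step schedule described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `StageSchedule`, `warmup_steps`, `semantic_weight`, and `total_loss` are hypothetical, the warm-up length is an arbitrary placeholder, and the summary does not say whether geometry keeps training once semantics are enabled. The sketch simply assumes the semantic loss is switched off during the geometry warm-up and added afterwards.

```python
from dataclasses import dataclass


@dataclass
class StageSchedule:
    """Hypothetical two-stage schedule: geometry first, semantics second."""
    warmup_steps: int             # assumed length of the geometry warm-up
    semantic_weight: float = 1.0  # assumed weight once semantics are enabled

    def weight(self, step: int) -> float:
        # No semantic supervision while the geometry is still warming up.
        return 0.0 if step < self.warmup_steps else self.semantic_weight


def total_loss(rgb_loss: float, geometry_loss: float, semantic_loss: float,
               step: int, schedule: StageSchedule) -> float:
    # RGB and geometric cues are always used; the semantic term is gated
    # by the schedule (an assumption, not a detail taken from the paper).
    return rgb_loss + geometry_loss + schedule.weight(step) * semantic_loss


schedule = StageSchedule(warmup_steps=1000)
warmup_value = total_loss(0.5, 0.25, 0.25, step=0, schedule=schedule)
later_value = total_loss(0.5, 0.25, 0.25, step=1000, schedule=schedule)
```

With these placeholder numbers, `warmup_value` contains only the RGB and geometry terms, while `later_value` also includes the semantic term. A real system would replace the hard switch with whatever criterion the paper uses to decide that geometry has stabilized.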