Indoor Asset Detection in Large Scale 360° Drone-Captured Imagery via 3D Gaussian Splatting

arXiv cs.CV / 4/8/2026


Key Points

  • The paper proposes an object-level detection and segmentation method for indoor assets in 3D Gaussian Splatting (3DGS) scenes reconstructed from large-scale 360° drone imagery.
  • It introduces a “3D object codebook” that combines mask semantics with spatial attributes of Gaussian primitives to improve multi-view mask association.
  • The approach merges 2D detection/segmentation outputs across multiple views using semantic and spatial constraints to form coherent 3D object instances.
  • Experiments on two large indoor scenes show strong multi-view mask consistency, with F1 improving by 65% over state-of-the-art baselines.
  • For 3D indoor asset detection, the method delivers an 11% mAP improvement over baseline techniques.

Abstract

We present an approach for object-level detection and segmentation of target indoor assets in 3D Gaussian Splatting (3DGS) scenes reconstructed from 360° drone-captured imagery. We introduce a 3D object codebook that jointly leverages mask semantics and the spatial information of the corresponding Gaussian primitives to guide multi-view mask association and indoor asset detection. By integrating 2D object detection and segmentation models with semantically and spatially constrained merging procedures, our method aggregates masks from multiple views into coherent 3D object instances. Experiments on two large indoor scenes demonstrate reliable multi-view mask consistency, improving F1 score by 65% over state-of-the-art baselines, and accurate object-level 3D indoor asset detection, achieving an 11% mAP gain over baseline methods.
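To make the multi-view merging idea concrete, here is a minimal sketch of how per-view masks might be grouped into 3D instances using both a semantic constraint (embedding similarity) and a spatial constraint (distance between the centroids of the Gaussian primitives each mask covers). The greedy strategy, the `sem_thresh`/`dist_thresh` parameters, and the data layout are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def merge_masks(masks, sem_thresh=0.8, dist_thresh=0.5):
    """Greedily group per-view masks into 3D object instances.

    Each mask is a dict with:
      'emb'    : unit-norm semantic embedding (np.ndarray)
      'center' : 3D centroid of the Gaussian primitives the mask covers
    A mask joins an existing instance only if it passes BOTH the
    semantic-similarity and spatial-distance checks (hypothetical
    thresholds, not from the paper).
    """
    instances = []  # each: {'embs': [...], 'centers': [...], 'members': [...]}
    for i, m in enumerate(masks):
        best = None
        for inst in instances:
            # Semantic constraint: mean cosine similarity to the instance.
            sem = float(np.mean([e @ m['emb'] for e in inst['embs']]))
            # Spatial constraint: distance to the instance's mean centroid.
            dist = float(np.linalg.norm(np.mean(inst['centers'], axis=0) - m['center']))
            if sem >= sem_thresh and dist <= dist_thresh:
                if best is None or sem > best[1]:
                    best = (inst, sem)
        if best is None:
            # No compatible instance: start a new 3D object.
            instances.append({'embs': [m['emb']], 'centers': [m['center']], 'members': [i]})
        else:
            inst = best[0]
            inst['embs'].append(m['emb'])
            inst['centers'].append(m['center'])
            inst['members'].append(i)
    return instances
```

Requiring both constraints to hold is what separates two visually similar assets in different rooms (same semantics, far apart) from one asset seen in two views (same semantics, nearby centroids).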