IRIS-SLAM: Unified Geo-Instance Representations for Robust Semantic Localization and Mapping

arXiv cs.RO / 3/30/2026

Key Points

  • IRIS-SLAM is proposed as an RGB semantic SLAM system that aims to add deeper semantic understanding and more robust loop closure than prior dense geometric SLAM methods.
  • The method builds unified geometric-instance representations by extending a geometry foundation model to predict both dense geometry and cross-view consistent instance embeddings.
  • It uses these instance embeddings for a semantic-synergized data association mechanism and instance-guided loop closure detection, targeting the decoupled architectures and fragile data association that hinder current semantic mapping approaches.
  • The approach uses viewpoint-agnostic semantic anchors to bridge the gap between geometric reconstruction and open-vocabulary mapping.
  • Experiments (per the abstract) show IRIS-SLAM outperforms existing state-of-the-art methods, especially for map consistency and wide-baseline loop closure reliability.
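
To make the idea of embedding-based data association concrete, here is a minimal sketch. It is not the authors' mechanism, which the abstract does not detail. It assumes each detected instance and each map instance carries a fixed-length embedding vector, and it matches them one-to-one by greedy cosine similarity. The function names `cosine` and `associate` and the threshold `min_sim` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def associate(frame_embs, map_embs, min_sim=0.8):
    """Greedy one-to-one matching of frame instances to map instances.

    Illustrative assumption: similar embeddings mean "same object".
    Returns a dict {frame_index: map_index}; unmatched frame instances
    are absent and could be treated as new map instances.
    """
    # Collect every pair whose similarity clears the threshold.
    pairs = []
    for i, f in enumerate(frame_embs):
        for j, m in enumerate(map_embs):
            s = cosine(f, m)
            if s >= min_sim:
                pairs.append((s, i, j))

    # Accept the most similar pairs first, using each index at most once.
    pairs.sort(reverse=True)
    matches, used_frame, used_map = {}, set(), set()
    for s, i, j in pairs:
        if i in used_frame or j in used_map:
            continue
        matches[i] = j
        used_frame.add(i)
        used_map.add(j)
    return matches


frame_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
map_embs = [[0.0, 0.9, 0.1], [1.0, 0.05, 0.0]]
matches = associate(frame_embs, map_embs)
```

In this toy input the first two frame instances match map instances 1 and 0 respectively, while the third has no sufficiently similar counterpart and stays unmatched. A real system would likely combine such a score with geometric checks; greedy matching is used here only for brevity.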

Abstract

Geometry foundation models have significantly advanced dense geometric SLAM, yet existing systems often lack deep semantic understanding and robust loop closure capabilities. Meanwhile, contemporary semantic mapping approaches are frequently hindered by decoupled architectures and fragile data association. We propose IRIS-SLAM, a novel RGB semantic SLAM system that leverages unified geometric-instance representations derived from an instance-extended foundation model. By extending a geometry foundation model to concurrently predict dense geometry and cross-view consistent instance embeddings, we enable a semantic-synergized association mechanism and instance-guided loop closure detection. Our approach effectively utilizes viewpoint-agnostic semantic anchors to bridge the gap between geometric reconstruction and open-vocabulary mapping. Experimental results demonstrate that IRIS-SLAM significantly outperforms state-of-the-art methods, particularly in map consistency and wide-baseline loop closure reliability.
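
The abstract does not specify how instance-guided loop closure detection is implemented. As one hedged illustration of the general idea, the sketch below assumes each keyframe is summarized by the set of map instance IDs visible in it, and proposes loop candidates by Jaccard overlap of those sets, which does not depend on where in the image the instances appear. The names `jaccard` and `loop_candidates` and the parameters `min_gap` and `min_overlap` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def loop_candidates(keyframes, query_idx, min_gap=10, min_overlap=0.5):
    """Rank earlier keyframes as loop closure candidates for the query.

    keyframes: list of sets of instance IDs, one set per keyframe (assumed input).
    min_gap:   skip the most recent keyframes, which overlap trivially.
    Returns a list of (keyframe_index, overlap) sorted by overlap, best first.
    """
    query = keyframes[query_idx]
    last_allowed = query_idx - min_gap  # newest keyframe still eligible
    scored = []
    for idx in range(max(0, last_allowed + 1)):
        score = jaccard(keyframes[idx], query)
        if score >= min_overlap:
            scored.append((idx, score))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


keyframes = [{1, 2, 3}, {2, 3, 4}, {5, 6}, {7}, {1, 2, 3, 8}]
candidates = loop_candidates(keyframes, query_idx=4, min_gap=2)
```

Here only keyframe 0 shares enough instances with the query to be proposed. Any candidate found this way would still need geometric verification before being added as a loop constraint.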