Change-Robust Online Spatial-Semantic Topological Mapping

arXiv cs.RO / 5/5/2026


Key Points

  • The paper addresses the need for autonomous robots to perform spatial-semantic reasoning that remains robust to environmental appearance changes and scene dynamics.
  • It argues that common pipelines that attach semantics to SLAM metric maps can fail when data association and relocalization degrade under appearance shifts.
  • The proposed CROSS approach replaces a globally consistent metric map with an online, pose-aware topological graph of RGB-D keyframes, explicitly handling perceptual ambiguity.
  • CROSS uses sequential hypothesis testing in continuous SE(3) and maintains a bounded Gaussian-mixture pose belief to better manage loop closures and “kidnapped-robot” scenarios.
  • Experiments on severe appearance-change settings, including real-robot object-goal navigation with lighting changes and furniture rearrangement, show improved robustness versus SLAM-based and topological baselines while staying safe under perceptual aliasing.
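The sequential hypothesis testing mentioned in the key points can be illustrated with Wald's sequential probability ratio test (SPRT). The sketch below is not the paper's estimator: it accumulates scalar log-likelihood ratios rather than reasoning in continuous SE(3), and the function name `sprt`, the error rates `alpha` and `beta`, and the example inputs are all assumptions made for illustration.

```python
import math


def sprt(log_likelihood_ratios, alpha=0.01, beta=0.01):
    """Wald's SPRT over a stream of per-observation log-likelihood ratios.

    Each value is log p(z | H1) - log p(z | H0), where H1 could stand for
    "the candidate place match is correct" (a hypothetical reading).
    Returns (decision, number_of_observations_used).
    """
    upper = math.log((1 - beta) / alpha)  # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))  # accept H0 at or below this
    total = 0.0
    n = 0
    for n, llr in enumerate(log_likelihood_ratios, start=1):
        total += llr
        if total >= upper:
            return "accept", n
        if total <= lower:
            return "reject", n
    # Evidence never crossed either threshold: defer the decision.
    return "undecided", n


if __name__ == "__main__":
    print(sprt([2.0, 2.0, 2.0]))   # consistent evidence for H1
    print(sprt([0.5, -0.5, 0.3]))  # ambiguous evidence
```

The "undecided" outcome is the point of the illustration: a test that can defer until enough evidence accumulates is one way to avoid committing to a match on a single ambiguous observation.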

Abstract

Autonomous robots require change-robust spatial-semantic reasoning: using spatial and semantic knowledge to decide where to go, how to get there, and where the robot is despite environmental change. Existing approaches typically attach semantics to SLAM-built metric maps, but these pipelines are brittle under appearance shifts and scene dynamics, where data association and relocalization degrade. We propose a Change-Robust Online Spatial-Semantic (CROSS) representation that replaces a globally consistent metric substrate with an online, pose-aware topological graph of RGB-D keyframes. The system explicitly reasons over perceptual ambiguity using sequential hypothesis testing in continuous SE(3). Our estimator maintains a bounded Gaussian-mixture belief over poses, enabling principled handling of loop closures and kidnapped-robot events. Experiments under severe appearance change, including real-robot object-goal navigation with lighting shifts and furniture rearrangement, demonstrate improved robustness over SLAM-based and topological baselines while remaining safe under perceptual aliasing.
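To make the idea of a bounded Gaussian-mixture belief concrete, the sketch below caps the number of pose hypotheses by keeping the highest-weight components and renormalizing. The abstract does not say how CROSS bounds its mixture, so top-K pruning is an assumption here (merging similar components would be another option), and the names `PoseHypothesis` and `bound_mixture` are hypothetical. Poses and covariances are treated as opaque values.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class PoseHypothesis:
    weight: float     # mixture weight of this component
    mean: Any         # pose estimate (e.g. a 4x4 transform); opaque here
    covariance: Any   # pose uncertainty; opaque here


def bound_mixture(hyps: List[PoseHypothesis],
                  max_components: int) -> List[PoseHypothesis]:
    """Keep the max_components heaviest hypotheses and renormalize weights."""
    if max_components < 1:
        raise ValueError("max_components must be at least 1")
    kept = sorted(hyps, key=lambda h: h.weight, reverse=True)[:max_components]
    total = sum(h.weight for h in kept)
    if total <= 0.0:
        raise ValueError("mixture has no positive weight to renormalize")
    return [PoseHypothesis(h.weight / total, h.mean, h.covariance)
            for h in kept]


if __name__ == "__main__":
    belief = [
        PoseHypothesis(0.5, "kitchen", None),
        PoseHypothesis(0.2, "hallway", None),
        PoseHypothesis(0.3, "office", None),
    ]
    for h in bound_mixture(belief, max_components=2):
        print(h.mean, round(h.weight, 3))
```

Keeping more than one component is what lets a belief of this kind represent several candidate locations at once, for example after a kidnapped-robot event, while the cap keeps the per-update cost fixed.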