IGV-RRT: Prior-Real-Time Observation Fusion for Active Object Search in Changing Environments

arXiv cs.RO / 3/24/2026

Key Points

  • The paper tackles Object Goal Navigation (ObjectNav) in indoor environments where objects may move, which can make past scene knowledge unreliable.
  • It proposes a probabilistic planning framework that fuses uncertainty-aware prior information with online target relevance estimates generated via a Vision Language Model (VLM).
  • The approach uses a dual-layer semantic mapping system: an Information Gain Map (IGM) derived from a 3D scene graph for global guidance, and a VLM score map (VLM-SM) for local validation of the current scene.
  • The real-time planner, IGV-RRT, biases tree expansion toward semantically salient regions that have both high prior likelihood and strong online relevance, while preserving kinematic feasibility.
  • Simulation and real-world experiments show improved search efficiency and success rates over baseline methods under object rearrangement.
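The idea of biasing tree expansion with two map layers can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `fuse_maps` and `sample_cells`, the linear blend weight `alpha`, and the min-max normalisation are all assumptions chosen for illustration. The sketch blends a prior layer (standing in for the IGM) with an online layer (standing in for the VLM-SM) into one probability grid, then draws candidate cells from it, as a sampling-based planner might do before extending its tree.

```python
import numpy as np


def fuse_maps(igm, vlm_sm, alpha=0.5, eps=1e-9):
    """Blend a prior layer and an online layer into a sampling distribution.

    igm, vlm_sm: 2D arrays of equal shape (hypothetical stand-ins for the
    Information Gain Map and the VLM score map).
    alpha: assumed blend weight; 1.0 trusts only the prior layer.
    eps: small floor so every cell keeps a non-zero sampling probability.
    """
    igm = np.asarray(igm, dtype=float)
    vlm_sm = np.asarray(vlm_sm, dtype=float)
    if igm.shape != vlm_sm.shape:
        raise ValueError("both layers must share the same grid shape")

    def unit(m):
        # Min-max normalise to [0, 1]; a constant layer carries no preference.
        span = m.max() - m.min()
        return (m - m.min()) / span if span > 0 else np.zeros_like(m)

    fused = alpha * unit(igm) + (1.0 - alpha) * unit(vlm_sm) + eps
    return fused / fused.sum()


def sample_cells(prob, n, rng):
    """Draw n grid cells (row, col) with probability proportional to prob."""
    flat = rng.choice(prob.size, size=n, p=prob.ravel())
    return np.column_stack(np.unravel_index(flat, prob.shape))


# Usage: a 2x2 grid where the top-right cell scores high in both layers.
prior = np.array([[0.0, 1.0], [0.0, 0.0]])
online = np.array([[0.0, 1.0], [0.5, 0.0]])
prob = fuse_maps(prior, online, alpha=0.5)
candidates = sample_cells(prob, 10, np.random.default_rng(0))
```

A linear blend is only one option. A product of the two layers would demand agreement between prior and online evidence, which may suit scenes where objects have moved, but the summary does not say which rule the authors use.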

Abstract

Object Goal Navigation (ObjectNav) in temporally changing indoor environments is challenging because object relocation can invalidate historical scene knowledge. To address this issue, we propose a probabilistic planning framework that combines uncertainty-aware scene priors with online target relevance estimates derived from a Vision Language Model (VLM). The framework contains a dual-layer semantic mapping module and a real-time planner. The mapping module includes an Information Gain Map (IGM) built from a 3D scene graph (3DSG) during prior exploration to model object co-occurrence relations and provide global guidance on likely target regions. It also maintains a VLM score map (VLM-SM) that fuses confidence-weighted semantic observations into the map for local validation of the current scene. Based on these two cues, we develop a planner, IGV-RRT, that jointly exploits information gain and semantic evidence for online decision making. The planner biases tree expansion toward semantically salient regions with high prior likelihood and strong online relevance, while preserving kinematic feasibility through gradient-based analysis. Simulation and real-world experiments demonstrate that the proposed method effectively mitigates the impact of object rearrangement, achieving higher search efficiency and success rates than representative baselines in complex indoor environments.
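To make "fuses confidence-weighted semantic observations into the map" concrete, here is one possible reading as a sketch: a per-cell running weighted average. The abstract does not give the update rule, so the class name `ScoreMap`, its `update` signature, and the averaging formula are assumptions, not the authors' method.

```python
import numpy as np


class ScoreMap:
    """Hypothetical grid that averages relevance scores weighted by confidence."""

    def __init__(self, shape):
        self.score = np.zeros(shape)   # fused relevance per cell
        self.weight = np.zeros(shape)  # accumulated confidence per cell

    def update(self, mask, relevance, confidence):
        """Fuse one observation into the cells selected by mask.

        mask: boolean array marking the cells currently in view.
        relevance: scalar target-relevance score for this view (assumed in [0, 1]).
        confidence: scalar or per-cell array weighting this observation.
        """
        mask = np.asarray(mask, dtype=bool)
        conf = np.broadcast_to(np.asarray(confidence, dtype=float), self.score.shape)
        w_new = np.where(mask, conf, 0.0)
        total = self.weight + w_new
        seen = total > 0  # skip never-observed cells to avoid dividing by zero
        self.score[seen] = (
            self.weight[seen] * self.score[seen] + w_new[seen] * relevance
        ) / total[seen]
        self.weight = total


# Usage: two views of the same cell, the second one more confident.
grid = ScoreMap((2, 2))
in_view = np.array([[True, False], [False, False]])
grid.update(in_view, relevance=0.8, confidence=1.0)
grid.update(in_view, relevance=0.2, confidence=3.0)
```

Under this rule a later, more confident observation pulls the cell's score toward its own value, which is one way a map could recover after an object has been moved. Cells outside the mask keep their previous score and weight.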