POMDP-based Object Search with Growing State Space and Hybrid Action Domain

arXiv cs.RO / 4/17/2026

Key Points

  • The paper tackles mobile-robot object search in cluttered indoor spaces by modeling it as a high-dimensional POMDP with a growing state space and hybrid (continuous + discrete) actions in 3D environments.
  • It proposes a new online POMDP solver, GNPF-kCT, which combines a perception module, MCTS with belief tree reuse, a neural process network to prune ineffective primitive actions, and k-center hypersphere discretization to manage large action spaces.
  • A modified UCB strategy uses belief differences and action-value estimates within cells of estimated diameters to guide MCTS expansion efficiently.
  • The method includes a “guessed target object” strategy using a grid-world model to improve search efficiency when information or rewards are limited.
  • Experiments in Gazebo (Fetch and Stretch) and real office environments show faster and more reliable localization than POMDP baselines and non-POMDP SOTA solvers, including LLM-based methods, under comparable computational and perception constraints.
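To make the k-center idea in the key points concrete, here is a minimal sketch of the classic greedy farthest-point heuristic for choosing k representative actions from a set of sampled continuous actions. It is not the paper's GNPF-kCT procedure: the function name `greedy_k_center`, the tuple encoding of actions, and the Euclidean metric are all assumptions made for illustration.

```python
import math


def greedy_k_center(points, k):
    """Greedy farthest-point heuristic for the k-center problem.

    points: list of equal-length coordinate tuples (hypothetical action samples).
    Returns (centers, radius), where radius is the largest distance from any
    point to its nearest chosen center. Illustrative only.
    """
    if not points or k <= 0:
        return [], 0.0

    # Start from the first point; any starting point works for this heuristic.
    centers = [points[0]]
    # dist[i] holds the distance from points[i] to its nearest chosen center.
    dist = [math.dist(p, centers[0]) for p in points]

    while len(centers) < min(k, len(points)):
        # The next center is the point that is currently covered worst.
        far = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[far])
        dist = [min(d, math.dist(p, points[far])) for d, p in zip(dist, points)]

    return centers, max(dist)


if __name__ == "__main__":
    samples = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
    print(greedy_k_center(samples, 2))
```

Each chosen center can be read as the middle of a hypersphere cell whose radius is at most the returned value, which is one simple way to picture discretizing a large continuous action space into a handful of candidates.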

Abstract

Efficiently locating target objects in complex indoor environments with diverse furniture, such as shelves, tables, and beds, is a significant challenge for mobile robots. This difficulty arises from factors like localization errors, limited fields of view, and visual occlusion. We address this by framing the object-search task as a high-dimensional Partially Observable Markov Decision Process (POMDP) with a growing state space and hybrid (continuous and discrete) action spaces in 3D environments. Based on a meticulously designed perception module, a novel online POMDP solver named the growing neural process filtered k-center clustering tree (GNPF-kCT) is proposed to tackle this problem. Optimal actions are selected using Monte Carlo Tree Search (MCTS) with belief tree reuse for growing state space, a neural process network to filter useless primitive actions, and k-center clustering hypersphere discretization for efficient refinement of high-dimensional action spaces. A modified upper-confidence bound (UCB), informed by belief differences and action value functions within cells of estimated diameters, guides MCTS expansion. Theoretical analysis validates the convergence and performance potential of our method. To address scenarios with limited information or rewards, we also introduce a guessed target object with a grid-world model as a key strategy to enhance search efficiency. Extensive Gazebo simulations with Fetch and Stretch robots demonstrate faster and more reliable target localization than POMDP-based baselines and state-of-the-art (SOTA) non-POMDP-based solvers, especially large language model (LLM) based methods, in object search under the same computational constraints and perception systems. Real-world tests in office environments confirm the practical applicability of our approach. Project page: https://sites.google.com/view/gnpfkct.
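For readers unfamiliar with UCB-guided tree search, the sketch below shows the standard UCB1 selection rule that MCTS variants commonly start from. It is not the modified UCB described in the abstract, which additionally uses belief differences and cell diameters. The function name `ucb1_select`, the `stats` layout, and the default exploration constant are assumptions for illustration.

```python
import math


def ucb1_select(stats, c=math.sqrt(2)):
    """Pick an action with the standard UCB1 rule (illustrative baseline only).

    stats: dict mapping action -> (visit_count, total_return); hypothetical layout.
    c: exploration constant; larger values favor rarely tried actions.
    """
    total_visits = sum(n for n, _ in stats.values())
    best_action, best_score = None, -math.inf

    for action, (n, total_return) in stats.items():
        if n == 0:
            # Unvisited actions are tried first so every arm gets an estimate.
            return action
        mean_value = total_return / n
        bonus = c * math.sqrt(math.log(total_visits) / n)
        score = mean_value + bonus
        if score > best_score:
            best_action, best_score = action, score

    return best_action


if __name__ == "__main__":
    node_stats = {"move_base": (10, 5.0), "look_shelf": (1, 0.4)}
    print(ucb1_select(node_stats))        # exploration bonus favors the rarely tried action
    print(ucb1_select(node_stats, c=0))   # pure exploitation picks the higher mean
```

With the default constant the rarely visited action wins because its exploration bonus is large. Setting `c=0` removes the bonus and selects the action with the higher average return.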