POMDP-based Object Search with Growing State Space and Hybrid Action Domain
Yongbo Chen, Hesheng Wang, Shoudong Huang, Hanna Kurniawati
TLDR
This paper introduces GNPF-kCT, an online POMDP solver for robot object search in complex 3D indoor environments that locates targets faster and more reliably than POMDP-based and non-POMDP baselines in simulation.
Key contributions
- Proposes GNPF-kCT, an online POMDP solver for object search with growing state and hybrid action spaces.
- Uses MCTS with belief tree reuse, a neural process filter, and k-center clustering for action space refinement.
- Introduces a guessed target object strategy and a modified UCB for enhanced search efficiency.
- Demonstrates faster and more reliable target localization than POMDP-based and SOTA non-POMDP baselines in Gazebo simulations, and confirms practical applicability in real-world office tests.
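To make the k-center clustering idea in the contributions above more concrete, here is a minimal sketch of the classic greedy farthest-point heuristic (Gonzalez's algorithm) applied to sampled continuous actions. It is not the paper's hypersphere discretization; the function name `greedy_k_center`, the `first` parameter, and the toy points are assumptions made purely for illustration.

```python
import math


def greedy_k_center(points, k, first=0):
    """Greedy farthest-point heuristic for k-center (illustrative only).

    points: list of equal-length coordinate tuples (e.g. sampled actions).
    Returns (indices of chosen centers, covering radius).
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [first]
    # dist[i] is the distance from point i to its nearest chosen center.
    dist = [math.dist(p, points[first]) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from all current centers.
        nxt = max(range(len(points)), key=dist.__getitem__)
        centers.append(nxt)
        for i, p in enumerate(points):
            dist[i] = min(dist[i], math.dist(p, points[nxt]))
    # Covering radius: the largest distance from any point to its center.
    return centers, max(dist)


if __name__ == "__main__":
    # Hypothetical 2D actions forming two well-separated groups.
    actions = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
    print(greedy_k_center(actions, k=2))
```

Each center stands in for a cell of nearby actions, and the covering radius bounds how far any action lies from its representative. Refining the discretization then amounts to increasing `k`.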
Why it matters
GNPF-kCT targets the conditions that make real-world object search hard: localization errors, limited fields of view, and visual occlusion. In Gazebo simulations it locates targets faster and more reliably than POMDP-based baselines and SOTA non-POMDP solvers, including LLM-based methods, under the same computational constraints and perception systems. Real-world tests in office environments support its practical applicability.
Original Abstract
Efficiently locating target objects in complex indoor environments with diverse furniture, such as shelves, tables, and beds, is a significant challenge for mobile robots. This difficulty arises from factors like localization errors, limited fields of view, and visual occlusion. We address this by framing the object-search task as a high-dimensional Partially Observable Markov Decision Process (POMDP) with a growing state space and hybrid (continuous and discrete) action spaces in 3D environments. Based on a meticulously designed perception module, a novel online POMDP solver named the growing neural process filtered k-center clustering tree (GNPF-kCT) is proposed to tackle this problem. Optimal actions are selected using Monte Carlo Tree Search (MCTS) with belief tree reuse for growing state space, a neural process network to filter useless primitive actions, and k-center clustering hypersphere discretization for efficient refinement of high-dimensional action spaces. A modified upper-confidence bound (UCB), informed by belief differences and action value functions within cells of estimated diameters, guides MCTS expansion. Theoretical analysis validates the convergence and performance potential of our method. To address scenarios with limited information or rewards, we also introduce a guessed target object with a grid-world model as a key strategy to enhance search efficiency. Extensive Gazebo simulations with Fetch and Stretch robots demonstrate faster and more reliable target localization than POMDP-based baselines and state-of-the-art (SOTA) non-POMDP-based solvers, especially large language model (LLM) based methods, in object search under the same computational constraints and perception systems. Real-world tests in office environments confirm the practical applicability of our approach. Project page: https://sites.google.com/view/gnpfkct.
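For readers unfamiliar with how an upper-confidence bound guides MCTS expansion, the sketch below shows the standard UCB1 selection rule. It is not the paper's modified UCB, which additionally uses belief differences and cell diameters. The function name `ucb1_select`, the `stats` layout, and the exploration constant `c` are assumptions for illustration.

```python
import math


def ucb1_select(stats, c=1.0):
    """Pick an action with the standard UCB1 rule (illustrative only).

    stats: dict mapping action -> (visit_count, mean_value).
    c: exploration constant (hypothetical default).
    """
    total = sum(n for n, _ in stats.values())
    best, best_score = None, -math.inf
    for action, (n, q) in stats.items():
        if n == 0:
            # Unvisited actions are tried before any scoring takes place.
            return action
        # Mean value plus an exploration bonus that shrinks with visits.
        score = q + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best, best_score = action, score
    return best


if __name__ == "__main__":
    # Hypothetical statistics for two actions at one belief node.
    print(ucb1_select({"move": (10, 1.0), "look": (1, 0.9)}))
```

With `c=1.0` the rarely tried action wins because of its larger exploration bonus. Setting `c=0` reduces the rule to greedy selection on the mean value.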