ArXiv TLDR

Stop Wandering: Efficient Vision-Language Navigation via Metacognitive Reasoning

2604.02318

Xueying Li, Feng Lyu, Hao Wu, Mingliu Liu, Jia-Nan Liu + 1 more

cs.RO cs.CV

TLDR

MetaNav improves Vision-Language Navigation efficiency and robustness using metacognitive reasoning, reducing redundant exploration and VLM queries.

Key contributions

  • MetaNav introduces metacognitive reasoning for efficient Vision-Language Navigation (VLN).
  • Integrates spatial memory, history-aware planning, and LLM-powered reflective correction.
  • History-aware planning penalizes revisiting, while reflection generates rules to avoid stagnation.
  • Achieves state-of-the-art performance on GOAT-Bench, HM3D-OVON, and A-EQA while reducing VLM queries by 20.7%.
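
The history-aware planning idea above can be illustrated with a small sketch. This is not the authors' implementation: the grid-cell bookkeeping, the linear penalty, and the names `select_frontier`, `cell_size`, and `revisit_penalty` are all assumptions made for illustration. It only shows how a revisit penalty can change which frontier a greedy selector picks.

```python
from collections import Counter


def select_frontier(frontiers, base_scores, visit_history,
                    cell_size=1.0, revisit_penalty=0.5):
    """Pick the frontier with the best score after a revisit penalty.

    Hypothetical sketch: positions are (x, y) tuples, base_scores would
    come from some upstream scorer (e.g. a VLM), and the penalty grows
    linearly with how often the agent has already been in that cell.
    """
    def cell(p):
        # Discretize a position into a coarse grid cell.
        return (int(p[0] // cell_size), int(p[1] // cell_size))

    # Count how many past positions fall into each grid cell.
    visits = Counter(cell(p) for p in visit_history)

    best, best_score = None, float("-inf")
    for frontier, score in zip(frontiers, base_scores):
        adjusted = score - revisit_penalty * visits[cell(frontier)]
        if adjusted > best_score:
            best, best_score = frontier, adjusted
    return best, best_score


frontiers = [(0.5, 0.5), (3.5, 0.5)]
base_scores = [1.0, 0.8]
history = [(0.2, 0.2), (0.4, 0.6)]  # two earlier visits near the first frontier

greedy_choice, _ = select_frontier(frontiers, base_scores, [])
aware_choice, _ = select_frontier(frontiers, base_scores, history)
```

With an empty history the selector behaves greedily and takes the higher-scoring frontier; with the two earlier visits recorded, the penalty pushes it toward the unexplored one.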

Why it matters

Existing training-free VLN agents rely on greedy frontier selection and passive spatial memory, which leads to local oscillation and redundant revisiting. MetaNav's metacognitive approach lets the agent monitor its exploration progress, diagnose strategy failures, and adapt accordingly. The reported result is state-of-the-art performance on GOAT-Bench, HM3D-OVON, and A-EQA with 20.7% fewer VLM queries.

Original Abstract

Training-free Vision-Language Navigation (VLN) agents powered by foundation models can follow instructions and explore 3D environments. However, existing approaches rely on greedy frontier selection and passive spatial memory, leading to inefficient behaviors such as local oscillation and redundant revisiting. We argue that this stems from a lack of metacognitive capabilities: the agent cannot monitor its exploration progress, diagnose strategy failures, or adapt accordingly. To address this, we propose MetaNav, a metacognitive navigation agent integrating spatial memory, history-aware planning, and reflective correction. Spatial memory builds a persistent 3D semantic map. History-aware planning penalizes revisiting to improve efficiency. Reflective correction detects stagnation and uses an LLM to generate corrective rules that guide future frontier selection. Experiments on GOAT-Bench, HM3D-OVON, and A-EQA show that MetaNav achieves state-of-the-art performance while reducing VLM queries by 20.7%, demonstrating that metacognitive reasoning significantly improves robustness and efficiency.
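
The abstract does not say how stagnation is detected, so the following is only a minimal sketch of one plausible trigger: flag stagnation when the agent's net displacement over a recent window of steps is small. The function name `is_stagnating` and the parameters `window` and `min_displacement` are hypothetical, and the LLM call that would generate corrective rules is deliberately left out.

```python
from math import dist


def is_stagnating(positions, window=6, min_displacement=1.0):
    """Return True if the agent has barely moved over the last `window` steps.

    Hypothetical heuristic: compares the first and last position of the
    recent window, so back-and-forth oscillation counts as stagnation.
    """
    if len(positions) < window:
        return False  # not enough history to judge
    recent = positions[-window:]
    return dist(recent[0], recent[-1]) < min_displacement


oscillating = [(0, 0), (1, 0), (0, 0), (1, 0), (0, 0), (0.5, 0)]
progressing = [(i, 0) for i in range(6)]

stuck = is_stagnating(oscillating)
moving = is_stagnating(progressing)
```

In a full agent, a positive result would be the point at which a reflection step is invoked; here it simply returns a boolean.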
