ArXiv TLDR

VULCAN: Vision-Language-Model Enhanced Multi-Agent Cooperative Navigation for Indoor Fire-Disaster Response

arXiv: 2604.12831

Shengding Liu, Qiben Yan

cs.RO

TLDR

VULCAN is a VLM-enhanced multi-agent cooperative navigation framework for indoor fire-disaster response, pairing multi-modal perception with a fire-augmented Habitat-Matterport3D benchmark that models dense smoke, thermal hazards, and sensor degradation.

Key contributions

  • Introduces VULCAN, a VLM-enhanced multi-agent navigation framework for fire disaster response.
  • Leverages multi-modal perception and Vision-Language Models for robust navigation in extreme conditions.
  • Extends Habitat-Matterport3D with physically realistic fire simulations: smoke diffusion, thermal hazards, and sensor degradation (see the sketch after this list).
  • Identifies critical failure modes of existing methods in fire scenarios, highlighting the need for hazard-aware planning.
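
The paper describes its fire augmentation only at a high level (smoke diffusion, thermal hazards, sensor degradation). As a minimal illustrative sketch of how such degradation is commonly modeled, not VULCAN's actual implementation, the Python snippet below attenuates RGB frames with a Beer-Lambert fog model and truncates depth returns beyond a smoke-limited range. All function names and parameter values are hypothetical.

    import numpy as np

    def apply_smoke(rgb, depth, beta=0.8, airlight=0.78):
        """Attenuate an RGB frame with a distance-dependent smoke (fog) model.

        rgb:      HxWx3 float array in [0, 1]
        depth:    HxW float array of per-pixel distances in meters
        beta:     smoke density coefficient (hypothetical value)
        airlight: ambient smoke brightness (hypothetical value)
        """
        # Beer-Lambert transmission: fraction of light surviving the smoke column.
        t = np.exp(-beta * depth)[..., None]          # HxWx1, broadcasts over RGB
        # Standard atmospheric scattering model: observed = scene*t + airlight*(1 - t).
        return rgb * t + airlight * (1.0 - t)

    def degrade_depth(depth, max_range=3.0, noise_std=0.05, rng=None):
        """Add sensor noise and drop depth returns beyond a smoke-limited range."""
        rng = rng or np.random.default_rng(0)
        noisy = depth + rng.normal(0.0, noise_std, depth.shape)
        noisy[depth > max_range] = 0.0                # simulated sensor dropout
        return noisy

Heavier smoke can be modeled simply by raising beta, which shortens both the visible range in RGB and the usable depth range.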

Why it matters

This paper tackles autonomous search and rescue in indoor fire disasters, where existing vision-based navigation systems degrade sharply. VULCAN offers a robust, VLM-enhanced multi-agent alternative, and its fire-augmented benchmark gives researchers a testbed for developing more reliable systems for extreme environments.

Original Abstract

Indoor fire disasters pose severe challenges to autonomous search and rescue due to dense smoke, high temperatures, and dynamically evolving indoor environments. In such time-critical scenarios, multi-agent cooperative navigation is particularly useful, as it enables faster and broader exploration than single-agent approaches. However, existing multi-agent navigation systems are primarily vision-based and designed for benign indoor settings, leading to significant performance degradation under fire-driven dynamic conditions. In this paper, we present VULCAN, a multi-agent cooperative navigation framework based on multi-modal perception and vision-language models (VLMs), tailored for indoor fire disaster response. We extend the Habitat-Matterport3D benchmark by simulating physically realistic fire scenarios, including smoke diffusion, thermal hazards, and sensor degradation. We evaluate representative multi-agent cooperative navigation baselines under both normal and fire-driven environments. Our results reveal critical failure modes of existing methods in fire scenarios and underscore the necessity of robust perception and hazard-aware planning for reliable multi-agent search and rescue.
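
The abstract does not spell out how the VLM enters the planning loop. Purely as an illustration of the general pattern (hazard-aware selection among exploration frontiers via a language model), the sketch below asks a chat-style VLM to score candidate frontiers; the query_vlm callable, the prompt wording, and the frontier schema are assumptions, not the paper's API.

    import json

    def rank_frontiers(frontiers, hazard_notes, query_vlm):
        """Rank candidate exploration frontiers with a VLM.

        frontiers:    list of dicts like {"id": 0, "desc": "doorway, light smoke"}
        hazard_notes: free-text summary of observed smoke/heat readings
        query_vlm:    hypothetical callable (prompt str -> response str);
                      plug in any chat-style VLM client here
        """
        prompt = (
            "You coordinate rescue robots in a burning building.\n"
            f"Observed hazards: {hazard_notes}\n"
            "Score each frontier 0-10 for search value minus risk, and reply "
            'as JSON, e.g. {"0": 7, "1": 2}.\n'
            f"Frontiers: {json.dumps(frontiers)}"
        )
        scores = json.loads(query_vlm(prompt))
        # Visit the highest-scoring frontier first.
        return sorted(frontiers, key=lambda f: -scores[str(f["id"])])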
