VULCAN: Vision-Language-Model Enhanced Multi-Agent Cooperative Navigation for Indoor Fire-Disaster Response

arXiv cs.RO · April 15, 2026


Key Points

  • The paper introduces VULCAN, a multi-agent cooperative navigation framework designed specifically for indoor fire disaster response by combining multi-modal perception with vision-language models (VLMs).
  • It argues that existing multi-agent navigation systems—typically vision-only and built for benign environments—suffer major performance drops under fire-specific dynamics like smoke, heat, and changing layouts.
  • The authors extend the Habitat-Matterport3D benchmark with physically realistic fire simulations, including smoke diffusion, thermal hazards, and sensor degradation, to enable more credible evaluations.
  • Experiments compare multiple baseline cooperative navigation approaches in both normal and fire-driven settings, identifying critical failure modes and highlighting the need for robust, hazard-aware perception and planning.
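The paper's fire-extended benchmark couples smoke diffusion with sensor degradation. The details of the authors' simulator are not given here, but the general idea can be sketched with a simple grid diffusion step and a visibility-based attenuation of depth readings; the diffusion coefficient, extinction coefficient, and noise scale below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def diffuse_smoke(density, source, diff=0.15, steps=1):
    """Explicit finite-difference diffusion on a 2-D smoke-density grid.

    density: 2-D array of smoke in [0, 1]; source: (row, col) of the fire,
    which keeps emitting smoke each step. diff is an assumed coefficient.
    """
    d = density.copy()
    for _ in range(steps):
        lap = (np.roll(d, 1, 0) + np.roll(d, -1, 0)
               + np.roll(d, 1, 1) + np.roll(d, -1, 1) - 4.0 * d)
        d = d + diff * lap
        d[source] += 1.0              # fire cell re-emits smoke
        d = np.clip(d, 0.0, 1.0)
    return d

def degrade_depth(depth, smoke, max_range=10.0):
    """Attenuate depth readings by local smoke density (Beer-Lambert-style).

    The extinction coefficient (3.0) and noise scale are hypothetical.
    """
    visibility = np.exp(-3.0 * smoke)
    noisy = depth * visibility + np.random.normal(0.0, 0.05 * (1.0 - visibility))
    return np.clip(noisy, 0.0, max_range)
```

In a setup like this, cells near the fire quickly saturate with smoke, and depth sensing there collapses toward zero range, which is the kind of degradation the benchmark is designed to expose.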

Abstract

Indoor fire disasters pose severe challenges to autonomous search and rescue due to dense smoke, high temperatures, and dynamically evolving indoor environments. In such time-critical scenarios, multi-agent cooperative navigation is particularly useful, as it enables faster and broader exploration than single-agent approaches. However, existing multi-agent navigation systems are primarily vision-based and designed for benign indoor settings, leading to significant performance degradation under fire-driven dynamic conditions. In this paper, we present VULCAN, a multi-agent cooperative navigation framework based on multi-modal perception and vision-language models (VLMs), tailored for indoor fire disaster response. We extend the Habitat-Matterport3D benchmark by simulating physically realistic fire scenarios, including smoke diffusion, thermal hazards, and sensor degradation. We evaluate representative multi-agent cooperative navigation baselines under both normal and fire-driven environments. Our results reveal critical failure modes of existing methods in fire scenarios and underscore the necessity of robust perception and hazard-aware planning for reliable multi-agent search and rescue.
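The abstract's call for "hazard-aware planning" can be made concrete with a generic example (not the paper's method): augment a shortest-path search so that each step pays its base traversal cost plus a weighted penalty for local smoke or heat. A minimal Dijkstra sketch, with a hypothetical hazard weight `w`:

```python
import heapq

def hazard_aware_path(base_cost, hazard, start, goal, w=5.0):
    """Dijkstra on a 4-connected grid; each step pays base cost plus a
    weighted hazard penalty (hazard values assumed to lie in [0, 1])."""
    rows, cols = len(base_cost), len(base_cost[0])
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        d, cell = heapq.heappop(pq)
        if cell == goal:
            break
        if d > dist.get(cell, float("inf")):
            continue                      # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + base_cost[nr][nc] + w * hazard[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(pq, (nd, (nr, nc)))
    path, cur = [], goal                  # walk back from goal to start
    while cur != start:
        path.append(cur)
        cur = prev[cur]
    path.append(start)
    return path[::-1]
```

With a large enough hazard weight, the planner detours around burning cells even when the geometric shortcut runs through them, which is the qualitative behavior the authors argue vision-only baselines fail to produce under fire-driven conditions.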
