DeCoNav: Dialog enhanced Long-Horizon Collaborative Vision-Language Navigation

arXiv cs.RO / April 15, 2026


Key Points

  • DeCoNav introduces a decentralized, dialogue-enhanced framework for long-horizon collaborative vision-language navigation in multi-robot systems, addressing limitations of prior benchmarks that lacked synchronized shared-world execution and adaptive coordination.
  • The method triggers event-driven dialogue to exchange compact semantic states, enabling robots to dynamically reassign subgoals and replan when new evidence, uncertainty, or cross-agent conflicts arise.
  • It supports real-time adaptive coordination without a central controller, relying on synchronized execution semantics tied to dialogue-triggered replanning.
  • Evaluated on DeCoNavBench, which comprises 1,213 tasks across 176 HM3D scenes, DeCoNav improves the both-success rate (BSR) by 69.2%, indicating strong gains from dialogue-driven, dynamically reallocated planning.
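The event-triggered coordination loop described above can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the class names, the 0.5 confidence threshold, and the conflict-resolution rule (the less confident robot replans) are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class SemanticState:
    """Compact semantic state a robot shares via dialogue (hypothetical schema)."""
    robot_id: str
    position: tuple
    subgoal: str
    confidence: float                              # belief in the current subgoal
    observations: list = field(default_factory=list)  # new evidence since last exchange

def informative_event(state: SemanticState, peer: SemanticState) -> bool:
    """Trigger dialogue on new evidence, high uncertainty, or a cross-agent conflict."""
    new_evidence = bool(state.observations)
    uncertain = state.confidence < 0.5             # illustrative threshold
    conflict = state.subgoal == peer.subgoal       # both robots chasing one subgoal
    return new_evidence or uncertain or conflict

def dialogue_round(a: SemanticState, b: SemanticState, open_subgoals: list) -> None:
    """Exchange states and reassign subgoals peer-to-peer, with no central controller."""
    if not (informative_event(a, b) or informative_event(b, a)):
        return  # no informative event: both robots continue their current plans
    # Resolve a subgoal conflict: the more confident robot keeps the contested
    # subgoal, the other replans to the next open subgoal.
    if a.subgoal == b.subgoal and open_subgoals:
        loser = a if a.confidence < b.confidence else b
        loser.subgoal = open_subgoals.pop(0)
    # Evidence has been exchanged; clear it after the synchronized replanning step.
    a.observations.clear()
    b.observations.clear()
```

For example, if two robots both target the kitchen, a dialogue round leaves the more confident one on that subgoal and reroutes the other to a remaining open subgoal.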

Abstract

Long-horizon collaborative vision-language navigation (VLN) is critical for multi-robot systems to accomplish complex tasks beyond the capability of a single agent. CoNavBench takes a first step by introducing the first collaborative long-horizon VLN benchmark with relay-style multi-robot tasks, a collaboration taxonomy, and graph-grounded generation and evaluation to model handoffs and rendezvous in shared environments. However, existing benchmarks and evaluations often do not enforce strictly synchronized dual-robot rollout on a shared world timeline, and they typically rely on static coordination policies that cannot adapt when new cross-agent evidence emerges. We present Dialog enhanced Long-Horizon Collaborative Vision-Language Navigation (DeCoNav), a decentralized framework that couples event-triggered dialogue with dynamic task allocation and replanning for real-time, adaptive coordination. In DeCoNav, robots exchange compact semantic states via dialogue without a central controller. When informative events such as new evidence, uncertainty, or conflicts arise, dialogue is triggered to dynamically reassign subgoals and replan under synchronized execution. Implemented in DeCoNavBench with 1,213 tasks across 176 HM3D scenes, DeCoNav improves the both-success rate (BSR) by 69.2%, demonstrating the effectiveness of dialogue-driven, dynamically reallocated planning for multi-robot collaboration.
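The headline metric, both-success rate (BSR), plausibly counts the fraction of collaborative episodes in which both robots complete their assigned parts of the task. The exact definition is the benchmark's; the sketch below is one natural reading, with the function name chosen for illustration.

```python
def both_success_rate(episodes: list[tuple[bool, bool]]) -> float:
    """BSR over collaborative episodes, assuming an episode counts as a success
    only when BOTH robots succeed (illustrative definition).

    episodes: one (robot_a_success, robot_b_success) pair per task.
    """
    if not episodes:
        return 0.0
    return sum(a and b for a, b in episodes) / len(episodes)
```

Under this reading, a reported relative BSR improvement of 69.2% means the dialogue-enhanced method lifts this joint-success fraction well above a baseline where each robot's individual success rate may look similar but joint completion fails more often.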