VLN-NF: Feasibility-Aware Vision-and-Language Navigation with False-Premise Instructions

arXiv cs.RO / 4/14/2026

Key Points

The paper introduces VLN-NF, a new vision-and-language navigation benchmark that tests agents under false-premise instructions where the target does not exist in the specified room.
VLN-NF requires agents to navigate, perform in-room exploration to gather evidence, and explicitly output NOT-FOUND when the target is absent.
The benchmark is created with an LLM-based instruction rewriting pipeline and a VLM-assisted verification step, so that each rewritten instruction refers to a target that is plausible but factually absent from the specified room.
For evaluation, the authors propose REV-SPL to jointly score room reaching, exploration coverage, and decision correctness for the NOT-FOUND determination.
They propose ROAM, a two-stage hybrid (supervised room navigation plus LLM/VLM-guided exploration using a free-space clearance prior) that achieves the best REV-SPL compared with baselines that often under-explore and stop early.
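
To make the idea of a joint score concrete, here is a minimal sketch of a REV-SPL-style metric. The summary above does not give the paper's actual REV-SPL formula, so the combination below is an assumption for illustration only: it takes the standard SPL term (success weighted by shortest-path length over the longer of the taken and shortest paths), gates it on room reaching and a correct NOT-FOUND decision, and scales it by exploration coverage. The `Episode` fields and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Episode:
    reached_room: bool       # did the agent enter the instructed room?
    coverage: float          # fraction of the room explored, expected in [0, 1]
    decision_correct: bool   # did the agent correctly output NOT-FOUND?
    path_length: float       # length of the path the agent actually took
    shortest_length: float   # reference shortest-path length to the room


def spl_term(success: bool, path_length: float, shortest_length: float) -> float:
    """Standard SPL term for one episode: S * l / max(p, l)."""
    if not success:
        return 0.0
    denom = max(path_length, shortest_length)
    # Degenerate case (agent starts at the goal): treat as perfectly efficient.
    return shortest_length / denom if denom > 0 else 1.0


def rev_spl_like(episodes: Sequence[Episode]) -> float:
    """Illustrative joint score; NOT the paper's definition of REV-SPL."""
    if not episodes:
        return 0.0
    total = 0.0
    for ep in episodes:
        # Assumption: both room reaching and the decision must be correct.
        gate = ep.reached_room and ep.decision_correct
        coverage = min(max(ep.coverage, 0.0), 1.0)  # clamp to [0, 1]
        total += spl_term(gate, ep.path_length, ep.shortest_length) * coverage
    return total / len(episodes)


episodes = [
    Episode(True, 0.8, True, 12.0, 10.0),   # reached, explored well, correct
    Episode(True, 0.2, False, 10.0, 10.0),  # reached, stopped early, wrong
]
score = rev_spl_like(episodes)
```

Under this assumed form, an agent that reaches the room but stops early and decides incorrectly contributes zero, which mirrors the failure mode attributed to the baselines.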

Abstract

Conventional Vision-and-Language Navigation (VLN) benchmarks assume instructions are feasible and the referenced target exists, leaving agents ill-equipped to handle false-premise goals. We introduce VLN-NF, a benchmark with false-premise instructions where the target is absent from the specified room and agents must navigate, gather evidence through in-room exploration, and explicitly output NOT-FOUND. VLN-NF is constructed via a scalable pipeline that rewrites VLN instructions using an LLM and verifies target absence with a VLM, producing plausible yet factually incorrect goals. We further propose REV-SPL to jointly evaluate room reaching, exploration coverage, and decision correctness. To address this challenge, we present ROAM, a two-stage hybrid that combines supervised room-level navigation with LLM/VLM-driven in-room exploration guided by a free-space clearance prior. ROAM achieves the best REV-SPL among compared methods, while baselines often under-explore and terminate prematurely under unreliable instructions. The VLN-NF project page is available at https://vln-nf.github.io/.
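
The two-stage structure described in the abstract can be sketched as a simple control flow. This is a hedged illustration, not the authors' implementation: the callables `navigate_to_room` and `target_visible_from`, the `min_coverage` threshold, the `ROOM-NOT-REACHED` label, and the choice to visit viewpoints in descending order of clearance are all assumptions introduced here. The abstract says only that exploration is guided by a free-space clearance prior, not how that prior is applied.

```python
from typing import Callable, Dict

NOT_FOUND = "NOT-FOUND"
FOUND = "FOUND"
ROOM_NOT_REACHED = "ROOM-NOT-REACHED"  # hypothetical label, not from the paper


def run_two_stage_episode(
    navigate_to_room: Callable[[], bool],
    clearance_by_viewpoint: Dict[str, float],
    target_visible_from: Callable[[str], bool],
    min_coverage: float = 0.9,
) -> str:
    """Stage 1: reach the room. Stage 2: explore, then decide."""
    # Stage 1 (assumed interface): a room-level navigator that reports success.
    if not navigate_to_room():
        return ROOM_NOT_REACHED

    # Stage 2 (assumption): visit viewpoints with more free space first.
    ordered = sorted(
        clearance_by_viewpoint, key=clearance_by_viewpoint.get, reverse=True
    )
    visited = 0
    for viewpoint in ordered:
        if target_visible_from(viewpoint):
            return FOUND
        visited += 1
        # Stop once enough of the room has been checked without a sighting.
        if visited / len(ordered) >= min_coverage:
            break
    # Only declare absence after the exploration budget above is exhausted.
    # (A room with no viewpoints falls through here with no evidence gathered.)
    return NOT_FOUND


visit_log = []


def never_visible(viewpoint: str) -> bool:
    visit_log.append(viewpoint)
    return False


clearance = {"door": 0.5, "center": 2.0, "window": 1.0}
decision = run_two_stage_episode(lambda: True, clearance, never_visible)
```

The point of the sketch is the ordering of responsibilities: the NOT-FOUND decision is emitted only after in-room evidence has been gathered, rather than at the moment the agent first fails to see the target.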
