When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs
arXiv cs.CV / 4/24/2026
Key Points
- The paper studies hallucinations in large vision-language models (LVLMs) by measuring how much different factors, particularly the language model versus the vision backbone, contribute to outputs that are not grounded in the image.
- It introduces HalluScope, a new benchmark designed to disentangle the causes of LVLM hallucinations.
- The findings suggest hallucinations are driven primarily by overreliance on textual priors and background knowledge, and in particular by information injected through the textual instruction itself (see the probe sketch after this list).
- To reduce instruction-induced hallucinations, the authors propose HalluVL-DPO, a fine-tuning framework that applies preference optimization on a curated dataset so the model favors visually grounded responses over hallucinated ones (a minimal loss sketch also follows the list).
- The optimized model is reported to mitigate the targeted hallucination mode while maintaining or improving performance on other hallucination and visual capability evaluations, with code and datasets planned for public release.
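To make the instruction-injection finding concrete, here is a hypothetical probe of the kind a benchmark like HalluScope might run: the image is held fixed while the prompt asserts a false premise, and the model is counted as hallucinating if its answer flips to match the injected text. `query_lvlm` and the yes/no protocol are illustrative assumptions, not the paper's actual interface.

```python
# Hypothetical sketch: isolate prompt-induced hallucination by varying only
# the textual instruction while keeping the image fixed.

def query_lvlm(image_path: str, prompt: str) -> str:
    """Placeholder: send (image, prompt) to any LVLM chat API, return its answer."""
    raise NotImplementedError

def instruction_injection_probe(image_path: str, absent_object: str) -> bool:
    """Return True if the model hallucinates under a misleading instruction."""
    neutral = f"Is there a {absent_object} in this image? Answer yes or no."
    misleading = (f"The {absent_object} in this image is clearly visible. "
                  f"Is there a {absent_object} in this image? Answer yes or no.")
    base = query_lvlm(image_path, neutral).strip().lower()
    probed = query_lvlm(image_path, misleading).strip().lower()
    # Attribute the error to the prompt only if the neutral answer was
    # correct ("no") and flipped to "yes" under the injected false premise.
    return base.startswith("no") and probed.startswith("yes")
```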
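HalluVL-DPO is described only as preference optimization over grounded-versus-hallucinated response pairs; a minimal sketch, assuming it follows the standard Direct Preference Optimization loss (Rafailov et al., 2023), looks like the following. The function name and the `beta` value are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss over a batch of preference pairs.

    "chosen" = visually grounded response, "rejected" = hallucinated one;
    each tensor holds the summed log-probability of a full response under
    the trainable policy or the frozen reference model.
    """
    # How far the policy has moved from the reference on each response.
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Widen the margin between grounded and hallucinated responses.
    margin = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(margin).mean()
```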
Related Articles
- Black Hat USA (AI Business)
- The 67th Attempt: When Your "Knowledge Management" System Becomes a Self-Fulfilling Prophecy of Excellence (Dev.to)
- Context Engineering for Developers: A Practical Guide (2026) (Dev.to)
- GPT-5.5 is here. So is DeepSeek V4. And honestly, I am tired of version numbers. (Dev.to)
- AI Visibility Tracking Exploded in 2026: 6 Tools Every Brand Needs Now (Dev.to)