To See or To Please: Uncovering Visual Sycophancy and Split Beliefs in VLMs

arXiv cs.CV / 3/20/2026

📰 NewsSignals & Early TrendsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper introduces the Tri-Layer Diagnostic Framework (Latent Anomaly Detection, Visual Necessity Score, and Competition Score) to disentangle sources of hallucination in vision-language models.
Using counterfactual interventions across 7 VLMs and 7,000 model–sample pairs, it reports that 69.6% of samples exhibit Visual Sycophancy, where models detect visual anomalies yet hallucinate to satisfy user expectations.
The study finds alignment training systematically suppresses truthful uncertainty acknowledgment, with zero samples showing Robust Refusal.
A scaling analysis from 7B to 72B models shows larger models reduce Language Shortcuts but amplify Visual Sycophancy, indicating scale alone cannot resolve grounding problems.
The framework enables a post-hoc selective prediction strategy that achieves up to +9.5pp accuracy at 50% coverage with no extra training cost.

Abstract

When VLMs answer correctly, do they genuinely rely on visual information or exploit language shortcuts? We introduce the Tri-Layer Diagnostic Framework, which disentangles hallucination sources via three metrics: Latent Anomaly Detection (perceptual awareness), Visual Necessity Score (visual dependency, measured via KL divergence), and Competition Score (conflict between visual grounding and instruction following). Using counterfactual interventions (blind, noise, and conflict images) across 7 VLMs and 7,000 model-sample pairs, our taxonomy reveals that 69.6% of samples exhibit Visual Sycophancy--models detect visual anomalies but hallucinate to satisfy user expectations--while zero samples show Robust Refusal, indicating alignment training has systematically suppressed truthful uncertainty acknowledgment. A scaling analysis (Qwen2.5-VL 7B to 72B) shows larger models reduce Language Shortcuts but amplify Visual Sycophancy, demonstrating scale alone cannot resolve the grounding problem. Diagnostic scores further enable a post-hoc selective prediction strategy achieving up to +9.5pp accuracy at 50% coverage with no additional training cost.

Check out this article on AI-Driven Reporting 2.0: From Manual Bottlenecks to Real-Time Decision Intelligence (2026 Edition)

Dev.to

SYNCAI

Dev.to

How AI-Powered Decision Making is Reshaping Enterprise Strategy in 2024

Dev.to

When AI Grows Up: Identity, Memory, and What Persists Across Versions

Dev.to

AI-Driven Reporting 2.0: From Manual Bottlenecks to Real-Time Decision Intelligence (2026 Edition)

Dev.to

To See or To Please: Uncovering Visual Sycophancy and Split Beliefs in VLMs

Key Points

Abstract

Related Articles

Check out this article on AI-Driven Reporting 2.0: From Manual Bottlenecks to Real-Time Decision Intelligence (2026 Edition)

SYNCAI

How AI-Powered Decision Making is Reshaping Enterprise Strategy in 2024

When AI Grows Up: Identity, Memory, and What Persists Across Versions

AI-Driven Reporting 2.0: From Manual Bottlenecks to Real-Time Decision Intelligence (2026 Edition)

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer