HTDC: Hesitation-Triggered Differential Calibration for Mitigating Hallucination in Large Vision-Language Models
arXiv cs.CV · April 15, 2026
Key Points
- The paper identifies that hallucinations in large vision-language models can stem from unstable visual grounding combined with over-reliance on language priors.
- It proposes Hesitation-Triggered Differential Calibration (HTDC), a training-free decoding method that applies calibration only at decoding steps flagged as “hesitation” steps, rather than at every token.
- The hesitation signal is derived from fluctuations in token preference across intermediate layers, which serve as a proxy for unstable visual grounding.
- When triggered, HTDC compares the standard full-branch inference against two lightweight probes (visual-nullification and semantic-nullification) to suppress hallucination-prone candidates.
- Experiments on hallucination benchmarks show HTDC reduces hallucinations while preserving task accuracy and lowering computation versus per-step calibration.
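The mechanism in the key points above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the flip-counting hesitation score, the subtractive calibration formula, and the `alpha`, `beta`, and `threshold` parameters are all assumptions made for the sake of a runnable example.

```python
import numpy as np

def hesitation_score(layer_logits):
    # layer_logits: (num_layers, vocab_size) logits decoded from intermediate
    # layers at the current step. As a proxy for grounding instability, count
    # how often the top-1 token preference flips between adjacent layers.
    top1 = layer_logits.argmax(axis=-1)
    flips = np.sum(top1[1:] != top1[:-1])
    return flips / (len(top1) - 1)

def calibrated_logits(full, vis_null, sem_null, alpha=1.0, beta=1.0):
    # Differential calibration (hypothetical form): penalize tokens that the
    # visual-nullification and semantic-nullification probes also favor,
    # since those candidates are likely driven by language priors rather
    # than the image. alpha and beta are illustrative weights.
    return full - alpha * vis_null - beta * sem_null

def htdc_step(layer_logits, vis_null, sem_null, threshold=0.3):
    # Calibration fires only when the hesitation signal exceeds a threshold;
    # stable steps fall back to standard greedy decoding, which is what
    # saves computation relative to per-step calibration.
    full = layer_logits[-1]
    if hesitation_score(layer_logits) < threshold:
        return int(full.argmax())
    return int(calibrated_logits(full, vis_null, sem_null).argmax())
```

For example, with identical top-1 preferences across all layers the score is 0 and the calibration branch (and its two extra probe passes) is skipped entirely, while a step whose intermediate layers disagree triggers the differential comparison.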