VCE: A zero-cost hallucination mitigation method of LVLMs via visual contrastive editing
arXiv cs.CL / 4/22/2026
Key Points
- Large vision-language models (LVLMs) often produce object hallucinations: descriptions of objects that are not present in the input image. This failure mode is especially risky in domains such as medical imaging and autonomous driving.
- The paper argues that hallucinations are driven in part by language priors learned during pretraining, which bias the model toward statistically likely words.
- It proposes Visual Contrastive Editing (VCE), a label-free, post-hoc method that uses contrastive visual perturbations to expose and suppress hallucination tendencies; a minimal sketch of the detection step follows this list.
- VCE applies targeted, SVD-based parameter edits that isolate hallucination-relevant subspaces, avoiding the need for fine-tuning or labeled data; a sketch of the editing step also appears below.
- Experiments show VCE reduces object hallucination across multiple benchmarks while leaving inference cost unchanged.
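
To make the contrastive-perturbation idea concrete, here is a minimal PyTorch sketch of the detection step, assuming the simplest possible setup: compare a model's next-token logits on the real image against logits on a perturbed image, and flag tokens whose probability does not drop. The function name and scoring rule are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def contrastive_hallucination_scores(logits_visual: torch.Tensor,
                                      logits_perturbed: torch.Tensor) -> torch.Tensor:
    """Score how much each next-token prediction relies on the language prior.

    logits_visual:    (vocab,) logits given the original image
    logits_perturbed: (vocab,) logits given a perturbed (e.g., noised) image
    A high score means the token stays likely even without visual evidence,
    i.e., it is prior-driven and a candidate hallucination.
    """
    logp_vis = torch.log_softmax(logits_visual, dim=-1)
    logp_per = torch.log_softmax(logits_perturbed, dim=-1)
    return logp_per - logp_vis

# Toy demonstration with random logits standing in for a real LVLM head.
torch.manual_seed(0)
vocab = 8
logits_visual = torch.randn(vocab)
logits_perturbed = logits_visual.clone()
logits_perturbed[3] += 2.0          # token 3 stays likely without the image
scores = contrastive_hallucination_scores(logits_visual, logits_perturbed)
print(scores.argmax().item())       # -> 3, the prior-driven token
```

In practice the perturbation could be Gaussian noise, blurring, or a blank image; the key signal is that a hallucination-prone token's probability survives the loss of visual evidence.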

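The editing step can be sketched in the same spirit: collect hidden activations at prior-driven tokens, take their leading singular directions as a "hallucination subspace", and project that subspace out of a layer's weight matrix. `edit_weight_svd` and the rank-k projection below are assumptions about the general SVD-editing technique, not the paper's exact recipe.

```python
import torch

def edit_weight_svd(W: torch.Tensor, hallucination_acts: torch.Tensor,
                    k: int = 1) -> torch.Tensor:
    """Project the top-k hallucination-associated directions out of W.

    W:                  (out, in) weight matrix of some layer
    hallucination_acts: (n, in) activations collected at prior-driven tokens
    """
    # Leading right-singular vectors span the hallucination subspace.
    _, _, Vh = torch.linalg.svd(hallucination_acts, full_matrices=False)
    V_k = Vh[:k]                                  # (k, in) subspace basis
    P = torch.eye(W.shape[1]) - V_k.T @ V_k       # projector onto complement
    return W @ P                                  # W no longer reads that subspace

# Toy check: the edited weights ignore the isolated direction.
torch.manual_seed(0)
W = torch.randn(4, 6)
acts = torch.randn(10, 6)
W_edited = edit_weight_svd(W, acts, k=1)
v = torch.linalg.svd(acts, full_matrices=False).Vh[0]   # top direction
print(torch.allclose(W_edited @ v, torch.zeros(4), atol=1e-5))  # True
```

Because the edit is a closed-form projection applied once to existing weights, it adds no parameters and no inference-time computation, which is consistent with the "zero-cost" framing in the title.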

