Is a Picture Worth a Thousand Words? Adaptive Multimodal Fact-Checking with Visual Evidence Necessity

arXiv cs.CL / 4/7/2026

💬 OpinionIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper argues that adding visual evidence to multimodal fact-checking does not always improve accuracy and can sometimes reduce it when used indiscriminately.
It introduces AMuFC, a framework that adaptively decides when visual evidence is necessary using an Analyzer, while a Verifier predicts claim veracity conditioned on both evidence and that decision.
Experiments on three datasets show that using the Analyzer’s visual-evidence-necessity assessment substantially improves verification performance.
The authors also release WebFC, a newly constructed dataset intended to evaluate fact-checking modules in more realistic settings, alongside released code.

Abstract

Automated fact-checking is a crucial task not only in journalism but also across web platforms, where it supports a responsible information ecosystem and mitigates the harms of misinformation. While recent research has progressed from text-only to multimodal fact-checking, a prevailing assumption is that incorporating visual evidence universally improves performance. In this work, we challenge this assumption and show that indiscriminate use of multimodal evidence can reduce accuracy. To address this challenge, we propose AMuFC, a multimodal fact-checking framework that employs two collaborative agents with distinct roles for the adaptive use of visual evidence: An Analyzer determines whether visual evidence is necessary for claim verification, and a Verifier predicts claim veracity conditioned on both the retrieved evidence and the Analyzer's assessment. Experimental results on three datasets show that incorporating the Analyzer's assessment of visual evidence necessity into the Verifier's prediction yields substantial improvements in verification performance. In addition to all code, we release WebFC, a newly constructed dataset for evaluating fact-checking modules in a more realistic scenario, available at https://github.com/ssu-humane/AMuFC.