Probabilistic Concept Graph Reasoning for Multimodal Misinformation Detection

arXiv cs.CL / 3/27/2026


Key Points

  • Multimodal misinformation detection often fails against new manipulation tactics and relies on opaque “black-box” models, motivating a more interpretable approach.
  • The paper introduces Probabilistic Concept Graph Reasoning (PCGR), which builds a human-understandable concept graph from multimodal inputs and then performs hierarchical attention reasoning over that graph to judge claim veracity.
  • PCGR is designed to be interpretable and evolvable by automatically discovering and validating novel high-level concepts using multimodal large language models (MLLMs).
  • According to the abstract, PCGR achieves state-of-the-art accuracy and is more robust to emerging manipulation types, outperforming prior methods on both coarse detection and fine-grained manipulation recognition.
  • The core contribution reframes multimodal misinformation detection (MMD) as structured, concept-based reasoning, producing traceable reasoning chains that link evidence to conclusions.

Abstract

Multimodal misinformation poses an escalating challenge that often evades traditional detectors, which are opaque black boxes and fragile against new manipulation tactics. We present Probabilistic Concept Graph Reasoning (PCGR), an interpretable and evolvable framework that reframes multimodal misinformation detection (MMD) as structured and concept-based reasoning. PCGR follows a build-then-infer paradigm, which first constructs a graph of human-understandable concept nodes, including novel high-level concepts automatically discovered and validated by multimodal large language models (MLLMs), and then applies hierarchical attention over this concept graph to infer claim veracity. This design produces interpretable reasoning chains linking evidence to conclusions. Experiments demonstrate that PCGR achieves state-of-the-art MMD accuracy and robustness to emerging manipulation types, outperforming prior methods in both coarse detection and fine-grained manipulation recognition.
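
The build-then-infer idea can be illustrated with a toy sketch. This is not the paper's implementation: the abstract gives no architectural details, so the `Concept` class, the `attend` and `infer` functions, the two-level grouping, and every probability and relevance value below are assumptions chosen only to show how attention over a small concept graph can yield both a score and a traceable chain. In a real system the concept probabilities would presumably come from an MLLM and the attention logits would be learned.

```python
import math
from dataclasses import dataclass


@dataclass
class Concept:
    name: str
    prob: float       # assumed probability that the concept holds for this post
    relevance: float  # assumed unnormalized attention logit (hand-set here)


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attend(concepts):
    # Attention-weighted average of concept probabilities, plus the weights.
    weights = softmax([c.relevance for c in concepts])
    score = sum(w * c.prob for w, c in zip(weights, concepts))
    return score, weights


def infer(graph, group_relevance):
    # Level 1: pool low-level concepts into each high-level concept.
    # Level 2: pool the high-level concepts into one manipulation score.
    group_nodes, chain = [], []
    for group_name, concepts in graph.items():
        score, weights = attend(concepts)
        top = max(zip(weights, concepts), key=lambda wc: wc[0])[1]
        group_nodes.append(Concept(group_name, score, group_relevance[group_name]))
        chain.append((top.name, group_name))  # most-attended evidence per group
    final_score, group_weights = attend(group_nodes)
    return final_score, chain, dict(zip(graph, group_weights))


# Hypothetical concept graph for one image-caption pair.
graph = {
    "image_text_mismatch": [
        Concept("caption names a different location", 0.9, 2.0),
        Concept("date in caption conflicts with image", 0.4, 0.5),
    ],
    "visual_tampering": [
        Concept("inconsistent lighting on face region", 0.2, 1.0),
        Concept("spliced object edges", 0.1, 1.0),
    ],
}
group_relevance = {"image_text_mismatch": 1.5, "visual_tampering": 0.5}

score, chain, group_weights = infer(graph, group_relevance)
print(f"manipulation score: {score:.3f}")
for evidence, concept in chain:
    print(f"{evidence} -> {concept} (weight {group_weights[concept]:.2f})")
```

Each printed line pairs the most-attended piece of evidence with the high-level concept it supports and that concept's share of the final decision, which is the kind of evidence-to-conclusion trace the abstract describes.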