Dialectic-Med: Mitigating Diagnostic Hallucinations via Counterfactual Adversarial Multi-Agent Debate

arXiv cs.CL / 4/14/2026


Key Points

  • Dialectic-Med is proposed as a framework that uses counterfactual adversarial multi-agent debate to curb the confirmation bias / diagnostic hallucination in which medical multimodal LLMs (MLLMs) fabricate image findings to support an initial diagnostic hypothesis.
  • Three role-specialized agents (a proponent; an opponent equipped with a visual falsification module; and a mediator using a weighted consensus graph) perform falsification and integration dynamically, rather than forming a static consensus.
  • By explicitly modeling the cognitive process of falsification, the framework aims to keep diagnostic reasoning tightly grounded in verified visual regions.
  • Evaluations on MIMIC-CXR-VQA, VQA-RAD, and PathVQA report not only accuracy gains but also higher explanation faithfulness and a substantial reduction in hallucinations.

Abstract

Multimodal Large Language Models (MLLMs) in healthcare suffer from severe confirmation bias, often hallucinating visual details to support initial, potentially erroneous diagnostic hypotheses. Existing Chain-of-Thought (CoT) approaches lack intrinsic correction mechanisms, rendering them vulnerable to error propagation. To bridge this gap, we propose Dialectic-Med, a multi-agent framework that enforces diagnostic rigor through adversarial dialectics. Unlike static consensus models, Dialectic-Med orchestrates a dynamic interplay between three role-specialized agents: a proponent that formulates diagnostic hypotheses; an opponent equipped with a novel visual falsification module that actively retrieves contradictory visual evidence to challenge the proponent; and a mediator that resolves conflicts via a weighted consensus graph. By explicitly modeling the cognitive process of falsification, our framework ensures that diagnostic reasoning is tightly grounded in verified visual regions. Empirical evaluations on MIMIC-CXR-VQA, VQA-RAD, and PathVQA demonstrate that Dialectic-Med not only achieves state-of-the-art performance but also fundamentally enhances the trustworthiness of the reasoning process. Beyond accuracy, our approach significantly improves explanation faithfulness and decisively mitigates hallucinations, establishing a new standard over single-agent baselines.
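The proponent/opponent/mediator interplay described above can be sketched as a simple scoring loop. Everything below is a hypothetical illustration, not the paper's implementation: in Dialectic-Med each role would be backed by an MLLM reasoning over the medical image, whereas here findings, counter-evidence weights, and the consensus rule (net score = support minus counter-evidence) are stand-ins chosen for clarity.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    hypothesis: str
    support: float  # proponent's confidence, assumed normalized to [0, 1]


def proponent(findings):
    """Propose diagnostic hypotheses, scored by the fraction of findings
    that support each one. `findings` is a list of (finding, diagnosis) pairs."""
    raw = {}
    for _finding, dx in findings:
        raw[dx] = raw.get(dx, 0.0) + 1.0
    total = sum(raw.values())
    return [Claim(dx, s / total) for dx, s in raw.items()]


def opponent(claims, contradictions):
    """Stand-in for the visual falsification module: attach a counter-evidence
    weight to each claim (0.0 when nothing contradicts it)."""
    return {c.hypothesis: contradictions.get(c.hypothesis, 0.0) for c in claims}


def mediator(claims, counter):
    """Weighted consensus: net score = support - counter-evidence.
    Returns the winning hypothesis and the full score table."""
    scored = {c.hypothesis: c.support - counter[c.hypothesis] for c in claims}
    return max(scored, key=scored.get), scored
```

Under this toy rule, strong counter-evidence from the opponent can flip the mediator's decision away from the proponent's initial favorite, which is the behavior the debate loop is designed to produce.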