MedCausalX: Adaptive Causal Reasoning with Self-Reflection for Trustworthy Medical Vision-Language Models

arXiv cs.AI · March 25, 2026


Key Points

  • The paper argues that current medical vision-language chain-of-thought (CoT) models struggle with causal reasoning, making them prone to spurious correlations and reducing clinical reliability.
  • It proposes MedCausalX, an end-to-end framework that explicitly models causal reasoning chains in medical VLMs using a two-stage adaptive reflection mechanism with dedicated causal and verification tokens.
  • The authors introduce the CRMed dataset, which includes fine-grained anatomical annotations, structured causal reasoning chains, and counterfactual variants to teach causal relationships beyond shortcuts.
  • MedCausalX is trained with a trajectory-level causal correction objective using error-attributed reinforcement learning to improve causal consistency across reasoning paths.
  • Experiments on multiple benchmarks reportedly show a +5.4-point gain in diagnostic consistency, a reduction in hallucination of over 10 points, and top spatial grounding performance (IoU), outperforming prior methods.
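The two-stage reflection mechanism in the key points can be pictured as a decoding loop that switches modes whenever the model emits one of the special tokens. The sketch below is a minimal illustration under assumed token names (`<causal>`, `<verify>`) and a toy stand-in model; it is not the paper's implementation.

```python
# Hypothetical sketch of the two-stage adaptive reflection loop.
# The token names, the toy model, and the trigger logic are
# illustrative assumptions, not MedCausalX's actual code.

CAUSAL, VERIFY, EOS = "<causal>", "<verify>", "<eos>"

def toy_model(prompt, stage):
    """Stand-in for a VLM decoder: returns reasoning steps and ends with a
    special token when the model decides further reflection is needed."""
    if stage == "reason":
        # Pretend the model is unsure about a finding->diagnosis link.
        return ["step: opacity in left lower lobe", CAUSAL]
    if stage == "causal":
        return ["causal: opacity explained by consolidation, not device shadow", VERIFY]
    return ["verify: finding consistent with pneumonia", EOS]

def adaptive_reflect(prompt):
    trace, stage = [], "reason"
    while True:
        out = toy_model(prompt, stage)
        # Keep only the reasoning text; special tokens steer control flow.
        trace.extend(t for t in out if t not in (CAUSAL, VERIFY, EOS))
        last = out[-1]
        if last == CAUSAL:      # stage 1: model requests causal analysis
            stage = "causal"
        elif last == VERIFY:    # stage 2: model requests verification
            stage = "verify"
        else:                   # <eos>: reasoning chain is complete
            return trace

print(adaptive_reflect("CXR: describe and diagnose"))
```

The point of the structure is that reflection is *adaptive*: the model itself decides when to enter the causal-analysis and verification stages by emitting the tokens, rather than running a fixed number of reflection passes.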

Abstract

Vision-Language Models (VLMs) have enabled interpretable medical diagnosis by integrating visual perception with linguistic reasoning. Yet existing medical chain-of-thought (CoT) models lack explicit mechanisms to represent and enforce causal reasoning, leaving them vulnerable to spurious correlations and limiting their clinical reliability. We pinpoint three core challenges in medical CoT reasoning: how to adaptively trigger causal correction, construct high-quality causal-spurious contrastive samples, and maintain causal consistency across reasoning trajectories. To address these challenges, we propose MedCausalX, an end-to-end framework that explicitly models causal reasoning chains in medical VLMs. We first introduce the CRMed dataset, which provides fine-grained anatomical annotations, structured causal reasoning chains, and counterfactual variants that guide the learning of causal relationships beyond superficial correlations. Building upon CRMed, MedCausalX employs a two-stage adaptive reflection architecture equipped with ⟨causal⟩ and ⟨verify⟩ tokens, enabling the model to autonomously determine when and how to perform causal analysis and verification. Finally, a trajectory-level causal correction objective optimized through error-attributed reinforcement learning refines the reasoning chain, allowing the model to distinguish genuine causal dependencies from shortcut associations. Extensive experiments on multiple benchmarks show that MedCausalX consistently outperforms state-of-the-art methods, improving diagnostic consistency by +5.4 points, reducing hallucination by over 10 points, and attaining top spatial grounding IoU, thereby setting a new standard for causally grounded medical reasoning.
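The "trajectory-level causal correction objective optimized through error-attributed reinforcement learning" can be sketched as a reward that combines a shared terminal signal with per-step credit concentrated on the steps attributed as spurious. The shape below, the step labels, and the weights are assumptions for exposition; the paper's actual objective may differ.

```python
# Illustrative error-attributed reward: each reasoning step receives an
# equal share of the terminal reward, plus a per-step bonus or penalty
# depending on whether a critic attributes it as causal or spurious.
# All labels and weights here are hypothetical, not the paper's values.

def error_attributed_rewards(steps, final_correct, w_step=0.5, w_final=1.0):
    """Return one scalar reward per reasoning step in the trajectory."""
    terminal = w_final if final_correct else -w_final
    rewards = []
    for s in steps:
        # Penalize shortcut (spurious) steps, reward causal ones.
        attribution = -w_step if s["spurious"] else w_step
        rewards.append(terminal / len(steps) + attribution)
    return rewards

trajectory = [
    {"text": "opacity present in left lower lobe", "spurious": False},
    {"text": "chest tube visible, so pneumothorax", "spurious": True},  # shortcut
    {"text": "consolidation pattern supports pneumonia", "spurious": False},
]
print(error_attributed_rewards(trajectory, final_correct=True))
```

Concentrating the negative signal on the attributed step, rather than spreading a single trajectory reward uniformly, is what lets a policy-gradient update suppress shortcut associations while still reinforcing the causally sound parts of the same chain.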