MedConcept: Unsupervised Concept Discovery for Interpretability in Medical VLMs

arXiv cs.CV / 4/15/2026


Key Points

  • MedConcept proposes a framework that discovers "latent medical concepts" in a fully unsupervised manner from the latent representations of medical Vision-Language models, improving interpretability at the concept level.
  • The discovered concepts are identified as sparse neuron-level activations grounded in the shared pretrained representation, and are translated into pseudo-report-style textual summaries so that physicians can inspect the model's internal reasoning.
  • In contrast to existing gradient- and attention-based visualizations, which are task-specific and offer little reusability, MedConcept aims for concept-based explanations that can be reused across downstream tasks.
  • To compensate for the lack of quantitative evaluation, a "semantic verification protocol" is introduced that uses an independent pretrained medical LLM as a frozen external evaluator, scoring agreement with radiology reports on three metrics: Aligned, Unaligned, and Uncertain.
  • Code, prompts, and data are slated for release upon acceptance, and the framework is expected to serve as a benchmark for interpretability in medical VLMs.
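The sparse neuron-level concept discovery described above can be illustrated with a minimal sketch. This is a hypothetical implementation, not the paper's released code: it assumes concepts are found by keeping only the top-k strongest neuron activations in pooled VLM embeddings, with the surviving neuron indices acting as candidate concept identifiers.

```python
import numpy as np

def sparse_concept_activations(features: np.ndarray, k: int = 8) -> np.ndarray:
    """Keep only the k strongest activations per sample, zeroing the rest.

    `features` is an (n_samples, n_neurons) matrix of pooled VLM embeddings
    (a stand-in for the paper's pretrained representations). The nonzero
    neuron indices in each row serve as that sample's candidate concepts.
    """
    out = np.zeros_like(features)
    # Column indices of the k largest activations in each row.
    idx = np.argpartition(features, -k, axis=1)[:, -k:]
    rows = np.arange(features.shape[0])[:, None]
    out[rows, idx] = features[rows, idx]
    return out

# Toy batch of 4 samples with 128-dimensional embeddings.
feats = np.random.default_rng(0).normal(size=(4, 128))
sparse = sparse_concept_activations(feats, k=8)
print((sparse != 0).sum(axis=1))  # each row keeps exactly 8 active neurons
```

In practice each recurring sparse neuron would then be mapped to a textual summary; that grounding step is what MedConcept's pseudo-report generation provides.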

Abstract

While medical Vision-Language models (VLMs) achieve strong performance on tasks such as tumor or organ segmentation and diagnosis prediction, their opaque latent representations limit clinical trust and the ability to explain predictions. Interpretability of these multimodal representations is therefore essential for the trustworthy clinical deployment of pretrained medical VLMs. However, current interpretability methods, such as gradient- or attention-based visualizations, are often limited to specific tasks such as classification. Moreover, they do not provide concept-level explanations derived from shared pretrained representations that can be reused across downstream tasks. We introduce MedConcept, a framework that uncovers latent medical concepts in a fully unsupervised manner and grounds them in clinically verifiable textual semantics. MedConcept identifies sparse neuron-level concept activations from pretrained VLM representations and translates them into pseudo-report-style summaries, enabling physician-level inspection of internal model reasoning. To address the lack of quantitative evaluation in concept-based interpretability, we introduce a quantitative semantic verification protocol that leverages an independent pretrained medical LLM as a frozen external evaluator to assess concept alignment with radiology reports. We define three concept scores, Aligned, Unaligned, and Uncertain, to quantify semantic support, contradiction, or ambiguity relative to radiology reports, and use them exclusively for post hoc evaluation. These scores provide a quantitative baseline for assessing interpretability in medical VLMs. All code, prompts, and data will be released upon acceptance.
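The three concept scores lend themselves to a simple aggregation. The sketch below is an assumption about how the protocol might tally verdicts, not the paper's implementation: a frozen external LLM is asked, per concept summary, whether the paired radiology report supports it, and the per-label fractions become the Aligned/Unaligned/Uncertain scores.

```python
from collections import Counter

def concept_scores(verdicts: list[str]) -> dict[str, float]:
    """Fraction of concept summaries judged aligned / unaligned / uncertain.

    `verdicts` holds one label per discovered concept, as returned by a
    frozen external medical LLM comparing each pseudo-report-style summary
    against the paired radiology report.
    """
    counts = Counter(verdicts)
    total = len(verdicts)
    return {label: counts.get(label, 0) / total
            for label in ("aligned", "unaligned", "uncertain")}

# Hypothetical verdicts for six discovered concepts.
verdicts = ["aligned", "aligned", "unaligned",
            "uncertain", "aligned", "uncertain"]
scores = concept_scores(verdicts)
print(scores)  # aligned 0.5, unaligned ~0.17, uncertain ~0.33
```

Because the evaluator LLM is frozen and independent of the VLM under inspection, the scores function purely as post hoc evaluation rather than a training signal, matching the protocol described in the abstract.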