MEDIC-AD: Towards Medical Vision-Language Model's Clinical Intelligence
arXiv cs.CV / 3/31/2026
Key Points
- The paper introduces MEDIC-AD, a clinically oriented medical vision-language model designed to make lesion detection, symptom tracking, and visual explainability more actionable for real-world clinical use.
- It uses a stage-wise framework with anomaly-aware tokens (<Ano>) to emphasize abnormal regions, improving lesion-centered representations for more accurate anomaly detection and segmentation.
- It adds inter-image difference tokens (<Diff>) to encode temporal changes across longitudinal studies, enabling the model to categorize disease trajectory as worsening, improving, or stable.
- A dedicated explainability stage trains the model to output lesion-focused heatmaps that align with its reasoning and serve as clinically faithful visual evidence.
- In experiments on real longitudinal clinical data drawn from hospital workflows, the authors report state-of-the-art performance and stable predictions suited to patient monitoring and decision support.
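
To make the three trajectory labels (worsening, improving, stable) concrete, here is a toy sketch in Python. It is not the paper's method: MEDIC-AD is described as learning temporal change through <Diff> tokens, whereas this example simply compares lesion area between two binary masks. The function names, the mask representation, and the 10% tolerance are all assumptions made for illustration.

```python
from typing import List

# Hypothetical representation: a binary lesion mask, 1 = abnormal pixel, 0 = normal.
Mask = List[List[int]]


def lesion_area(mask: Mask) -> int:
    """Count the abnormal pixels in a binary mask."""
    return sum(sum(row) for row in mask)


def classify_trajectory(prior: Mask, current: Mask, tolerance: float = 0.10) -> str:
    """Label the change between two studies as worsening, improving, or stable.

    `tolerance` is an assumed relative-change threshold, not a value from the paper.
    """
    before, after = lesion_area(prior), lesion_area(current)
    if before == 0:
        # No lesion in the prior study: any new abnormal area counts as worsening.
        return "stable" if after == 0 else "worsening"
    change = (after - before) / before
    if change > tolerance:
        return "worsening"
    if change < -tolerance:
        return "improving"
    return "stable"


prior_scan = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]    # 4 abnormal pixels
current_scan = [[1, 1, 1], [1, 1, 1], [0, 0, 0]]  # 6 abnormal pixels
print(classify_trajectory(prior_scan, current_scan))
```

A learned model would replace the area comparison with features extracted from both images, but the output space is the same three-way label described in the key points.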



