Mind the Rarities: Can Rare Skin Diseases Be Reliably Diagnosed via Diagnostic Reasoning?
arXiv cs.CV / 3/20/2026
Key Points
- DermCase is a long-context benchmark for diagnosing rare dermatological conditions, built from peer-reviewed case reports and comprising 26,030 multi-modal image-text pairs and 6,354 clinically challenging cases.
- The benchmark uses DermLIP-based similarity metrics to evaluate differential diagnosis quality; these metrics align better with dermatologists' judgments than existing metrics.
- Benchmarking 22 leading large vision-language models (LVLMs) reveals significant deficiencies in diagnostic accuracy, differential diagnosis, and clinical reasoning for rare conditions.
- Instruction tuning substantially improves performance, while Direct Preference Optimization (DPO) yields minimal gains; a systematic error analysis highlights the reasoning limitations of current models.
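To make the idea of an embedding-based similarity metric for differential diagnosis concrete, here is a minimal sketch. It is not the paper's metric: the scoring rule (mean best-match cosine similarity), the function names, and the toy vectors are all assumptions for illustration, and a real setup would replace `toy_embed` with embeddings from a domain model such as DermLIP.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def differential_score(predicted, reference, embed):
    """Hypothetical score: for each reference diagnosis, take the best
    cosine similarity to any predicted diagnosis, then average."""
    if not predicted or not reference:
        return 0.0
    total = 0.0
    for ref in reference:
        total += max(cosine(embed(ref), embed(p)) for p in predicted)
    return total / len(reference)

# Made-up 3-d vectors standing in for real text embeddings (assumption).
TOY_EMBEDDINGS = {
    "pemphigus vulgaris": [1.0, 0.0, 0.0],
    "bullous pemphigoid": [0.8, 0.6, 0.0],
    "psoriasis": [0.0, 0.0, 1.0],
}

def toy_embed(name):
    """Look up the placeholder embedding for a diagnosis name."""
    return TOY_EMBEDDINGS[name]

if __name__ == "__main__":
    reference = ["pemphigus vulgaris"]
    for guess in (["pemphigus vulgaris"], ["bullous pemphigoid"], ["psoriasis"]):
        print(guess, round(differential_score(guess, reference, toy_embed), 3))
```

Unlike exact string matching, a score of this kind gives partial credit to a clinically related near miss (here, another blistering disease) while still giving none to an unrelated diagnosis.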