DERM-3R: A Resource-Efficient Multimodal Agents Framework for Dermatologic Diagnosis and Treatment in Real-World Clinical Settings

arXiv cs.AI / 4/14/2026


Key Points

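  • DERM-3R is a resource-efficient multimodal agent framework for diagnosis and treatment in TCM (traditional Chinese medicine) dermatology under limited data and compute.
  • Specifically, its design decomposes decision-making into three tasks: (1) fine-grained lesion recognition, (2) multi-view lesion representation with specialist-level pathogenesis modeling, and (3) holistic reasoning for syndrome differentiation and treatment planning.
  • It uses three collaborative agents, DERM-Rec, DERM-Rep, and DERM-Reason, built on a lightweight multimodal LLM that was partially fine-tuned on 103 real-world TCM psoriasis cases.
  • Validation with automatic metrics, LLM-as-a-judge, and physician assessment reportedly shows performance that matches or exceeds large general-purpose multimodal models, despite minimal data and parameter updates.
  • The study suggests that, for complex clinical and integrative-medicine tasks such as dermatology, structured domain knowledge and multi-agent design can be a practical alternative to scale-dependent (brute-force) approaches.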

Abstract

Dermatologic diseases impose a large and growing global burden, affecting billions and substantially reducing quality of life. While modern therapies can rapidly control acute symptoms, long-term outcomes are often limited by single-target paradigms, recurrent courses, and insufficient attention to systemic comorbidities. Traditional Chinese medicine (TCM) provides a complementary holistic approach via syndrome differentiation and individualized treatment, but practice is hindered by non-standardized knowledge, incomplete multimodal records, and poor scalability of expert reasoning. We propose DERM-3R, a resource-efficient multimodal agent framework to model TCM dermatologic diagnosis and treatment under limited data and compute. Based on real-world workflows, we reformulate decision-making into three core issues: fine-grained lesion recognition, multi-view lesion representation with specialist-level pathogenesis modeling, and holistic reasoning for syndrome differentiation and treatment planning. DERM-3R comprises three collaborative agents: DERM-Rec, DERM-Rep, and DERM-Reason, each targeting one component of this pipeline. Built on a lightweight multimodal LLM and partially fine-tuned on 103 real-world TCM psoriasis cases, DERM-3R performs strongly across dermatologic reasoning tasks. Evaluations using automatic metrics, LLM-as-a-judge, and physician assessment show that despite minimal data and parameter updates, DERM-3R matches or surpasses large general-purpose multimodal models. These results suggest structured, domain-aware multi-agent modeling can be a practical alternative to brute-force scaling for complex clinical tasks in dermatology and integrative medicine.
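
The three-agent decomposition described in the abstract can be pictured as a sequential pipeline in which each stage consumes the previous stage's output. The following is only a minimal sketch under stated assumptions: the abstract does not specify how the agents exchange information, so the `Case` and `Trace` containers, the prompt wording, the function names, and the `stub_model` stand-in are all hypothetical illustrations rather than the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A "model" is any callable that maps a prompt string to a text response.
# In practice this would wrap a multimodal LLM; here it is an assumption.
Model = Callable[[str], str]


@dataclass
class Case:
    """Hypothetical input record: image identifiers plus free-text notes."""
    image_ids: List[str]
    notes: str


@dataclass
class Trace:
    """Hypothetical container for the intermediate output of each stage."""
    lesion_findings: List[str] = field(default_factory=list)
    lesion_representation: str = ""
    syndrome_and_plan: str = ""


def derm_rec(model: Model, case: Case) -> List[str]:
    """Stage 1 (DERM-Rec): fine-grained lesion recognition, one finding per image."""
    return [model(f"Describe the lesion in image {i}") for i in case.image_ids]


def derm_rep(model: Model, findings: List[str]) -> str:
    """Stage 2 (DERM-Rep): merge per-image findings into one representation."""
    joined = "; ".join(findings)
    return model(f"Summarize lesions and likely pathogenesis: {joined}")


def derm_reason(model: Model, representation: str, notes: str) -> str:
    """Stage 3 (DERM-Reason): holistic reasoning over representation and notes."""
    return model(
        f"Lesion summary: {representation}\n"
        f"Clinical notes: {notes}\n"
        "Propose a syndrome differentiation and a treatment plan."
    )


def run_pipeline(model: Model, case: Case) -> Trace:
    """Run the three stages in order, keeping every intermediate result."""
    trace = Trace()
    trace.lesion_findings = derm_rec(model, case)
    trace.lesion_representation = derm_rep(model, trace.lesion_findings)
    trace.syndrome_and_plan = derm_reason(
        model, trace.lesion_representation, case.notes
    )
    return trace


def stub_model(prompt: str) -> str:
    """Placeholder model so the sketch runs without any LLM dependency."""
    return f"[stub output for: {prompt[:40]}]"


if __name__ == "__main__":
    example = Case(image_ids=["img_001", "img_002"], notes="example notes")
    result = run_pipeline(stub_model, example)
    print(result.syndrome_and_plan)
```

Keeping each stage as a separate function with its own intermediate output mirrors the idea of one agent per component, and it makes every step inspectable on its own. Whether DERM-3R actually passes text between agents in this way is not stated in the abstract.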