Emulating Clinician Cognition via Self-Evolving Deep Clinical Research

arXiv cs.AI / 3/12/2026

Key Points

The paper introduces DxEvolve, a self-evolving diagnostic agent that uses an interactive clinical research workflow to autonomously requisition examinations and externalize clinical experience as cognition primitives.
On the MIMIC-CDM benchmark, DxEvolve improved diagnostic accuracy by 11.2% on average over backbone models and reached 90.4% on a reader-study subset, comparable to the clinician reference (88.8%).
On an independent external cohort, it improved accuracy over a competitive method by 10.2% for categories covered by the source cohort and by 17.1% for uncovered categories.
The approach aims to provide an accountable, governable pathway for continual evolution of clinical AI.

Abstract

Clinical diagnosis is a complex cognitive process, grounded in dynamic cue acquisition and continuous expertise accumulation. Yet most current artificial intelligence (AI) systems are misaligned with this reality, treating diagnosis as single-pass retrospective prediction while lacking auditable mechanisms for governed improvement. We developed DxEvolve, a self-evolving diagnostic agent that bridges these gaps through an interactive deep clinical research workflow. The framework autonomously requisitions examinations and continually externalizes clinical experience from increasing encounter exposure as diagnostic cognition primitives. On the MIMIC-CDM benchmark, DxEvolve improved diagnostic accuracy by 11.2% on average over backbone models and reached 90.4% on a reader-study subset, comparable to the clinician reference (88.8%). DxEvolve improved accuracy on an independent external cohort by 10.2% (categories covered by the source cohort) and 17.1% (uncovered categories) compared to the competitive method. By transforming experience into a governable learning asset, DxEvolve supports an accountable pathway for the continual evolution of clinical AI.