Can "AI" Be a Doctor? A Study of Empathy, Readability, and Alignment in Clinical LLMs
arXiv cs.CL / April 23, 2026
Key Points
- The study evaluates how well general-purpose and clinical LLMs align with clinical communication standards by measuring semantic fidelity, readability, and affective resonance in both structured explanations and real physician–patient dialogue.
- Baseline models tend to be more affectively extreme than physicians and often increase linguistic complexity, with some larger models producing significantly higher Flesch-Kincaid Grade Level (FKGL) scores than physician-authored responses.
- Empathy-oriented prompting can reduce extreme negativity and lower readability complexity, but it does not meaningfully improve semantic fidelity to physicians’ clinical content.
- Collaborative rewriting produces the strongest overall alignment, while rephrasing achieves the highest semantic similarity and also improves readability and emotional tone.
- Dual-stakeholder evaluation finds that no model outperforms physicians on epistemic criteria, while patients consistently prefer rewritten variants for clarity and emotional tone, suggesting LLMs should support clinical communication rather than replace expertise.
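The readability comparisons above rely on the Flesch-Kincaid Grade Level, a standard formula over sentence length and syllable counts. The sketch below is not from the paper; it is a minimal illustration of how FKGL is typically computed, using a crude vowel-group heuristic for syllable counting (real tools such as `textstat` use more careful syllabification).

```python
import re

def count_syllables(word: str) -> int:
    # Heuristic: count groups of consecutive vowels; every word >= 1 syllable.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

# Jargon-heavy phrasing scores many grades above a plain-language rewrite.
clinical = "Your hemoglobin concentration is elevated."
plain = "Your blood count is a bit high."
```

Higher FKGL means a higher U.S. school grade is needed to read the text comfortably, which is why models that add jargon or long sentences score worse than physician-authored answers.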