VERI-DPO: Evidence-Aware Alignment for Clinical Summarization via Claim Verification and Direct Preference Optimization
arXiv cs.CL / 3/12/2026
Key Points
- VERI-DPO combines claim verification with Direct Preference Optimization to train a summarizer that stays faithful to fragmented EHR evidence using a retrieval-augmented verifier.
- It labels claim-evidence pairs as Supported, Not Supported, or Not Addressed and uses these signals to derive length-controlled, contradiction-anchored preference pairs for learning.
- On held-out ICU patients in MIMIC-III-Ext-VeriFact-BHC, Not Supported rates drop from 10.7% to 1.9% (local verifier) and 11.6% to 6.4% (GPT-4o), and validity rises from 76.7% to 82.5%.
- The approach aims to reduce omissions and unsupported statements in LLM-based clinical summarization, improving reliability without sacrificing informative length.
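The pipeline sketched in the key points above — a verifier labels each extracted claim, and those labels anchor length-controlled preference pairs — can be illustrated with a minimal sketch. All names and the exact selection rules here are assumptions for illustration, not the paper's implementation; only the three label names come from the summary.

```python
# Hypothetical sketch: turn claim-level verification labels into a
# (chosen, rejected) preference pair for DPO-style training.
# Label taxonomy follows the paper; everything else is illustrative.
from dataclasses import dataclass

SUPPORTED, NOT_SUPPORTED, NOT_ADDRESSED = "Supported", "Not Supported", "Not Addressed"

@dataclass
class CandidateSummary:
    text: str
    claim_labels: list  # one verifier label per claim extracted from the summary

def faithfulness_score(cand: CandidateSummary) -> float:
    """Fraction of claims the verifier marks Supported."""
    if not cand.claim_labels:
        return 0.0
    return sum(lab == SUPPORTED for lab in cand.claim_labels) / len(cand.claim_labels)

def build_preference_pair(cands, max_len_ratio=1.5):
    """Contradiction-anchored pairing: the rejected summary must contain a
    Not Supported claim, the chosen one must not. The length-ratio cap is a
    stand-in for the paper's length control, so that preferences cannot be
    explained by summary length alone."""
    clean = [c for c in cands if NOT_SUPPORTED not in c.claim_labels]
    dirty = [c for c in cands if NOT_SUPPORTED in c.claim_labels]
    if not clean or not dirty:
        return None
    chosen = max(clean, key=faithfulness_score)
    rejected = max(dirty, key=faithfulness_score)
    if len(chosen.text) > max_len_ratio * max(len(rejected.text), 1):
        return None  # length-confounded pair; skip it
    return chosen, rejected
```

The resulting pairs would then feed a standard DPO objective, rewarding summaries whose claims the retrieval-augmented verifier can ground in the EHR evidence.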
Related Articles

Report: Observations of "Self-Referential Recursion" and "Stateful Emulation" in LLMs
note

Dialogues with Master Zhuge Liang Kongming (a ChatGPT roleplay), Part 45: "Galactic Civilization and the Dark Matter Engine"
note

GPT-5.4 mini/nano Arrive: Compact, High-Performance Models, Twice as Fast and Available on the Free Plan
note
Why a Perfect-Memory AI Agent Without Persona Drift is Architecturally Impossible
Dev.to
Learning to Reason with Curriculum I: Provable Benefits of Autocurriculum
arXiv cs.LG