AI Navigate

VERI-DPO: Evidence-Aware Alignment for Clinical Summarization via Claim Verification and Direct Preference Optimization

arXiv cs.CL / 3/12/2026

💬 Opinion · Models & Research

Key Points

  • VERI-DPO combines a retrieval-augmented claim verifier with Direct Preference Optimization to train a summarizer that stays faithful to fragmented EHR evidence.
  • It labels claim-evidence pairs as Supported, Not Supported, or Not Addressed and uses these signals to derive length-controlled, contradiction-anchored preference pairs for learning.
  • On held-out ICU patients in MIMIC-III-Ext-VeriFact-BHC, Not Supported rates drop from 10.7% to 1.9% (local verifier) and 11.6% to 6.4% (GPT-4o), and validity rises from 76.7% to 82.5%.
  • The approach aims to reduce omissions and unsupported statements in LLM-based clinical summarization, improving reliability without sacrificing informative length.
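The DPO step referenced above optimizes the summarizer directly on (chosen, rejected) summary pairs. As a hedged sketch (not the paper's implementation), the per-pair loss can be written in plain Python; the inputs are summed log-probabilities of each summary under the trained policy and a frozen reference model, and `beta` is a hypothetical temperature:

```python
import math

def dpo_loss(logp_pol_w: float, logp_ref_w: float,
             logp_pol_l: float, logp_ref_l: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen=w, rejected=l) summary pair.

    margin = beta * [(log-ratio of chosen) - (log-ratio of rejected)],
    loss   = -log(sigmoid(margin)), written stably as log1p(exp(-margin)).
    """
    margin = beta * ((logp_pol_w - logp_ref_w) - (logp_pol_l - logp_ref_l))
    return math.log1p(math.exp(-margin))
```

When the policy already prefers the chosen summary (positive margin), the loss falls below log 2; a zero margin gives exactly log 2.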

Abstract

Brief Hospital Course (BHC) narratives must be clinically useful yet faithful to fragmented EHR evidence. LLM-based clinical summarizers still introduce unsupported statements, and alignment can encourage omissions ("say-less" degeneration). We introduce VERI-DPO, which uses claim verification to mine preferences and distill them into the summarizer with Direct Preference Optimization (DPO). On MIMIC-III-Ext-VeriFact-BHC (100 ICU patients; patient-level splits), we train a retrieval-augmented verifier to label claim-evidence pairs as Supported, Not Supported, or Not Addressed via a single-token format. The verifier scores sentence-level claims from sampled BHC candidates and aggregates margins into a coverage-aware utility to mine length-controlled, contradiction-anchored preference pairs. On held-out patients, verifier-mined preferences separate candidates by contradiction density, and VERI-DPO reduces Not Supported claim rates from 10.7% to 1.9% (local verifier judge) and from 11.6% to 6.4% (GPT-4o judge), while improving validity from 76.7% to 82.5% and maintaining informative length.
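The mining step described above (verifier labels aggregated into a coverage-aware, length-controlled utility) can be sketched as follows. The label weights, `alpha` penalty, and `mine_pair` helper are illustrative assumptions, not the paper's actual scoring function:

```python
from typing import Dict, List, Tuple

# Hypothetical per-claim weights: reward support, heavily penalize contradiction.
LABEL_WEIGHTS = {"Supported": 1.0, "Not Addressed": 0.0, "Not Supported": -2.0}

def candidate_utility(labels: List[str], target_len: int, alpha: float = 0.05) -> float:
    """Coverage-aware utility for one candidate BHC (illustrative).

    Averages label weights over the candidate's sentence-level claims and
    subtracts a soft penalty for deviating from a target claim count, so
    mined preferences are length-controlled rather than 'say-less'.
    """
    score = sum(LABEL_WEIGHTS[l] for l in labels) / max(len(labels), 1)
    return score - alpha * abs(len(labels) - target_len)

def mine_pair(candidates: List[Dict], target_len: int = 8) -> Tuple[Dict, Dict]:
    """Pick a contradiction-anchored (chosen, rejected) pair from sampled candidates.

    Chosen: highest utility. Rejected: lowest-utility candidate that actually
    contains a Not Supported claim, anchoring the pair on a contradiction.
    """
    scored = sorted(candidates,
                    key=lambda c: candidate_utility(c["labels"], target_len),
                    reverse=True)
    chosen = scored[0]
    rejected = next((c for c in reversed(scored) if "Not Supported" in c["labels"]),
                    scored[-1])
    return chosen, rejected
```

For example, a fully supported 8-claim candidate outranks both a shorter fully supported one (length penalty) and one containing contradictions, and the contradicted candidate is selected as the rejected side.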