MRG-R1: Reinforcement Learning for Clinically Aligned Medical Report Generation

arXiv cs.CL / 3/30/2026

Key Points

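  • For the task of generating radiology reports from medical images, the paper argues that conventional token-level likelihood training prioritizes surface string matching, so medical correctness is not adequately reflected in the training objective.
  • To close this gap, MRG-R1 proposes a semantic-driven reinforcement learning (SRL) framework that directly optimizes the clinical correctness of the report as a whole.
  • Its core is a clinically grounded report-level reward function that reinforces semantic agreement on clinically relevant findings between generated and reference reports, imposing a learning constraint that goes beyond surface linguistic alignment.
  • In the evaluations, the accuracy and coverage of clinically relevant findings improve, and the authors report state-of-the-art (SOTA) clinical efficacy on the IU X-Ray and MIMIC-CXR benchmarks.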
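
The objective mismatch described in the points above can be written out in generic form. The notation here is an assumption for illustration and is not taken from the paper: x is the input image, y the reference report of length T, ŷ a report sampled from the model p_θ, R a report-level reward, and b a baseline. The summary does not say which reinforcement learning algorithm MRG-R1 uses, so the gradient shown is the standard policy-gradient (REINFORCE-style) estimator, given only as one common way to optimize such an objective.

```latex
% Token-level likelihood training: each reference token is scored on its own.
\mathcal{L}_{\mathrm{MLE}}(\theta)
  = -\sum_{t=1}^{T} \log p_\theta\!\left(y_t \mid y_{<t},\, x\right)

% Report-level objective: the whole sampled report receives a single reward.
J(\theta)
  = \mathbb{E}_{\hat{y} \sim p_\theta(\cdot \mid x)}
    \left[ R(\hat{y},\, y) \right]

% Standard policy-gradient estimator with a baseline b (assumed, generic form).
\nabla_\theta J(\theta)
  = \mathbb{E}_{\hat{y} \sim p_\theta(\cdot \mid x)}
    \left[ \left( R(\hat{y},\, y) - b \right)
           \nabla_\theta \log p_\theta(\hat{y} \mid x) \right]
```

In the first expression a clinically wrong token costs no more than any other mismatched token, whereas in the second the reward R can be defined directly on the findings a report asserts.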

Abstract

Medical report generation aims to automatically produce radiology-style reports from medical images, supporting efficient and accurate clinical decision-making. However, existing approaches predominantly rely on token-level likelihood training, which favors local lexical matching and leaves clinical correctness under-specified in the training objective. This behavior can be attributed to token-level likelihood optimization, which rewards surface-form agreement and therefore fails to directly encode constraints on medically accurate findings. To address this objective mismatch, we introduce a semantic-driven reinforcement learning (SRL) framework for medical report generation, named MRG-R1, which directly optimizes report-level clinical correctness rather than token-level likelihood. The key module is a clinically grounded report-level reward function, which reinforces semantic agreement in clinically relevant findings between generated and reference reports, thereby enabling learning signals that explicitly constrain medical correctness beyond surface linguistic alignment. Our evaluations show that the proposed framework improves the accuracy and coverage of clinically relevant findings in generated reports, and that MRG-R1 achieves state-of-the-art clinical efficacy on the IU X-Ray and MIMIC-CXR benchmark datasets.
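
As a rough illustration of what a report-level reward on clinically relevant findings might look like, here is a minimal Python sketch. It is not the reward used in MRG-R1, which the abstract does not specify. The finding vocabulary, the keyword-and-negation extractor, and the names `FINDINGS`, `NEGATION_WORDS`, `extract_findings`, and `clinical_reward` are all hypothetical. The sketch compares (finding, present/absent) pairs between a generated and a reference report and returns precision, coverage (recall), and their harmonic mean as the reward.

```python
from typing import Dict, Set, Tuple

# Hypothetical vocabulary, for illustration only. A real system would rely on
# a clinical labeler rather than keyword matching.
FINDINGS = ("cardiomegaly", "pleural effusion", "pneumothorax", "edema")
NEGATION_WORDS = {"no", "without"}


def extract_findings(report: str) -> Set[Tuple[str, bool]]:
    """Toy extractor: return (finding, is_present) pairs, sentence by sentence."""
    labels: Set[Tuple[str, bool]] = set()
    for sentence in report.lower().split("."):
        # A sentence containing a negation word marks all its findings as absent.
        negated = any(word in NEGATION_WORDS for word in sentence.split())
        for finding in FINDINGS:
            if finding in sentence:
                labels.add((finding, not negated))
    return labels


def clinical_reward(generated: str, reference: str) -> Dict[str, float]:
    """Score agreement on findings between a generated and a reference report."""
    gen = extract_findings(generated)
    ref = extract_findings(reference)
    if not gen and not ref:
        # Neither report mentions a known finding: treat as full agreement.
        return {"precision": 1.0, "coverage": 1.0, "reward": 1.0}
    matched = len(gen & ref)
    precision = matched / len(gen) if gen else 0.0
    coverage = matched / len(ref) if ref else 0.0
    total = precision + coverage
    reward = 2 * precision * coverage / total if total else 0.0
    return {"precision": precision, "coverage": coverage, "reward": reward}


reference = "Cardiomegaly is present. No pleural effusion or pneumothorax."
# Different wording, same findings.
paraphrase = "The heart is enlarged, consistent with cardiomegaly. No pneumothorax. No pleural effusion."
# Nearly identical wording, but one finding is flipped.
near_copy = "No cardiomegaly. No pleural effusion or pneumothorax."

print(clinical_reward(paraphrase, reference))
print(clinical_reward(near_copy, reference))
```

In this toy setup the paraphrase gets the full reward although it shares few exact phrases with the reference, while the near copy is penalized for reversing one finding. That is the kind of signal a surface-matching objective does not provide.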