Improving Automatic Summarization of Radiology Reports through Mid-Training of Large Language Models

arXiv cs.AI / 3/23/2026


Key Points

  • The paper proposes a subdomain mid-training step within the pre-training–fine-tuning pipeline to improve automatic summarization of radiology reports.
  • Among the three adaptation strategies tested, clinical-domain pre-training followed by subdomain mid-training, which produced the GatorTronT5-Radio model, yielded the best results.
  • GatorTronT5-Radio achieved higher ROUGE-L and RadGraph-F1 scores on OpenI and MIMIC-CXR, indicating improvements in both textual quality and factual accuracy.
  • The mid-training method enhances few-shot learning and helps alleviate cold-start problems for radiology summarization.
  • The study demonstrates that a 'pre-training, mid-training, fine-tuning' sequence can outperform direct fine-tuning in domain-specific medical NLP tasks; a sketch of the fine-tuning stage appears below.
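
As a rough illustration only, the sketch below shows what the final fine-tuning stage of such a pipeline could look like with Hugging Face Transformers, starting from a hypothetical mid-trained checkpoint and a local JSON file of findings/impression pairs. The checkpoint path, data files, and hyperparameters are placeholders, not the paper's actual configuration.

```python
# Sketch of the final fine-tuning stage: summarize radiology findings into impressions.
# Assumes a hypothetical mid-trained checkpoint path and a local JSON dataset with
# "findings" and "impression" fields; these are illustrative, not the paper's setup.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

CHECKPOINT = "path/to/mid-trained-radiology-t5"  # hypothetical mid-trained checkpoint
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSeq2SeqLM.from_pretrained(CHECKPOINT)

# Placeholder dataset: one JSON record per report with findings and impression sections.
dataset = load_dataset("json", data_files={"train": "radiology_train.json"})

def preprocess(batch):
    # Use the findings section as the source and the impression section as the summary target.
    inputs = tokenizer(batch["findings"], max_length=512, truncation=True)
    targets = tokenizer(text_target=batch["impression"], max_length=128, truncation=True)
    inputs["labels"] = targets["input_ids"]
    return inputs

tokenized = dataset["train"].map(
    preprocess, batched=True, remove_columns=dataset["train"].column_names
)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="radiology-summarizer",
        per_device_train_batch_size=8,
        num_train_epochs=3,
        learning_rate=3e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

The mid-training stage itself would precede this step as continued in-domain training of the clinical-domain checkpoint on radiology text; the sketch only covers the last stage of the sequence.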

Abstract

Automatic summarization of radiology reports is an essential application to reduce the burden on physicians. Previous studies have widely used the "pre-training, fine-tuning" strategy to adapt large language models (LLMs) for summarization. This study proposed subdomain adaptation through a mid-training step to improve summarization. We explored three adaptation strategies: (1) general-domain pre-training, (2) clinical-domain pre-training, and (3) clinical-domain pre-training followed by subdomain mid-training. We developed models using large-scale clinical text from University of Florida (UF) Health and conducted mid-training and fine-tuning experiments using widely used benchmark datasets, including OpenI and MIMIC-CXR. The experimental results show that the mid-trained model, GatorTronT5-Radio, achieved the best performance, outperforming models without mid-training on both a text-based measure (ROUGE-L) and a factuality measure (RadGraph-F1). Our mid-training method also demonstrates better few-shot learning and could alleviate the "cold start" problem reported as a learning barrier in previous studies. Our findings support the use of a "pre-training, mid-training, fine-tuning" strategy instead of the widely used direct fine-tuning approach.
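
For a concrete sense of the text-based metric, the snippet below computes ROUGE-L for a single generated impression against a reference using the rouge_score package. The example sentences are illustrative only; RadGraph-F1 is not reproduced here because it depends on the separately distributed RadGraph entity and relation extractor.

```python
# Minimal sketch of the text-overlap metric used in the study: ROUGE-L between a
# generated impression and a reference impression. The two sentences are made up
# for illustration; RadGraph-F1 additionally requires the RadGraph extractor.
from rouge_score import rouge_scorer

reference = "No acute cardiopulmonary abnormality."          # illustrative reference impression
prediction = "No acute cardiopulmonary process identified."  # illustrative model output

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
scores = scorer.score(reference, prediction)
print(f"ROUGE-L F1: {scores['rougeL'].fmeasure:.3f}")
```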