Performance Evaluation of Open-Source Large Language Models for Assisting Pathology Report Writing in Japanese

arXiv cs.CL / 3/13/2026

📰 News | Models & Research

Key Points

  • The paper evaluates seven open-source LLMs on three tasks related to Japanese pathology report writing: generation and information extraction of pathology diagnosis text following predefined formats, correction of typographical errors in reports, and subjective evaluation of model-generated explanatory text by pathologists and clinicians.
  • Thinking models and medical-specialized models showed advantages in structured reporting tasks that require reasoning and in typo correction.
  • Preferences for explanatory outputs varied substantially across raters, suggesting that pathologists and clinicians may not accept model-generated explanations consistently.
  • The study concludes that open-source LLMs can be useful for assisting Japanese pathology report writing in limited but clinically relevant scenarios.

Abstract

The performance of large language models (LLMs) for supporting pathology report writing in Japanese remains unexplored. We evaluated seven open-source LLMs from three perspectives: (A) generation and information extraction of pathology diagnosis text following predefined formats, (B) correction of typographical errors in Japanese pathology reports, and (C) subjective evaluation of model-generated explanatory text by pathologists and clinicians. Thinking models and medical-specialized models showed advantages in structured reporting tasks that required reasoning and in typo correction. In contrast, preferences for explanatory outputs varied substantially across raters. Although the utility of LLMs differed by task, our findings suggest that open-source LLMs can be useful for assisting Japanese pathology report writing in limited but clinically relevant scenarios.