Performance Evaluation of Open-Source Large Language Models for Assisting Pathology Report Writing in Japanese

arXiv cs.CL / 3/13/2026

Key Points

The paper evaluates seven open-source LLMs on three tasks related to Japanese pathology report writing: generation and information extraction of pathology diagnosis text following predefined formats, correction of typographical errors in reports, and subjective evaluation of model-generated explanatory text by pathologists and clinicians.
Thinking models and medical-specialized models showed advantages in structured reporting tasks that require reasoning and in typo correction.
Preferences for explanatory outputs varied substantially across raters, suggesting that pathologists and clinicians may not accept model-generated explanations consistently.
Although utility differed by task, the authors suggest that open-source LLMs can be useful for assisting Japanese pathology report writing in limited but clinically relevant scenarios.

Abstract

The performance of large language models (LLMs) for supporting pathology report writing in Japanese remains unexplored. We evaluated seven open-source LLMs from three perspectives: (A) generation and information extraction of pathology diagnosis text following predefined formats, (B) correction of typographical errors in Japanese pathology reports, and (C) subjective evaluation of model-generated explanatory text by pathologists and clinicians. Thinking models and medical-specialized models showed advantages in structured reporting tasks that required reasoning and in typo correction. In contrast, preferences for explanatory outputs varied substantially across raters. Although the utility of LLMs differed by task, our findings suggest that open-source LLMs can be useful for assisting Japanese pathology report writing in limited but clinically relevant scenarios.
