Evaluating Fine-Tuned LLM Model For Medical Transcription With Small Low-Resource Languages Validated Dataset

arXiv cs.AI / 3/27/2026


Key Points

  • The paper studies whether a domain-aligned LLM fine-tuned for medical transcription can reduce clinical documentation burden, focusing on Finnish as a low-resource language.
  • It fine-tunes LLaMA 3.1-8B using a small, validated corpus created from simulated clinical conversations and applies controlled preprocessing and optimization.
  • Evaluation with sevenfold cross-validation reports low n-gram overlap (BLEU 0.1214) but high semantic alignment (ROUGE-L 0.4982 and BERTScore F1 0.8230).
  • The authors conclude that fine-tuning can be effective for translating medical discourse in spoken Finnish and that the results support the feasibility of privacy-oriented, domain-specific clinical LLMs.
  • The study outlines directions for future work, aiming to advance clinically relevant transcription quality for low-resource languages.
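The gap between the low BLEU score and the higher ROUGE-L/BERTScore reported above comes down to how each metric matches tokens: BLEU counts exact n-gram overlap, while ROUGE-L rewards in-order word matches via the longest common subsequence (and BERTScore compares embeddings). A minimal pure-Python sketch of the first two, with hypothetical Finnish sentences where inflection breaks exact bigrams, illustrates why this pattern arises; it is not the paper's evaluation pipeline.

```python
from collections import Counter

def ngram_precision(cand, ref, n):
    """Clipped n-gram precision, the core quantity behind BLEU."""
    c = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    r = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    overlap = sum(min(cnt, r[g]) for g, cnt in c.items())
    return overlap / max(sum(c.values()), 1)

def rouge_l_f1(cand, ref):
    """ROUGE-L F1 from the longest common subsequence of tokens."""
    m, n = len(cand), len(ref)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = dp[i][j] + 1 if cand[i] == ref[j] \
                else max(dp[i][j + 1], dp[i + 1][j])
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    p, r = lcs / m, lcs / n
    return 2 * p * r / (p + r)

# Hypothetical reference and model transcript (not from the paper's corpus):
ref = "potilas valittaa kovaa päänsärkyä ja huimausta".split()
cand = "potilas kertoo kovaa päänsärkyä sekä huimausta".split()

print(ngram_precision(cand, ref, 2))  # 0.2 — only one shared bigram
print(rouge_l_f1(cand, ref))          # ~0.667 — word order largely preserved
```

Paraphrase and morphological variation thus depress exact n-gram metrics far more than subsequence- or embedding-based ones, which is consistent with the BLEU = 0.1214 vs. BERTScore F1 = 0.8230 contrast reported.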

Abstract

Clinical documentation is a critical factor for patient safety, diagnosis, and continuity of care, yet the administrative burden of EHRs is a significant contributor to physician burnout. This is a pressing issue for low-resource languages, including Finnish. This study investigates the effectiveness of a domain-aligned natural language processing (NLP) large language model for medical transcription in Finnish by fine-tuning LLaMA 3.1-8B on a small validated corpus of simulated clinical conversations conducted by students at Metropolia University of Applied Sciences. The fine-tuning process used a controlled preprocessing and optimization approach, and its effectiveness was evaluated by sevenfold cross-validation. The evaluation metrics for fine-tuned LLaMA 3.1-8B were BLEU = 0.1214, ROUGE-L = 0.4982, and BERTScore F1 = 0.8230, showing low n-gram overlap but strong semantic similarity with reference transcripts. This study indicates that fine-tuning can be an effective approach for translating medical discourse in spoken Finnish and supports the feasibility of fine-tuning a privacy-oriented, domain-specific large language model for clinical documentation in Finnish. It also provides directions for future work.
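Sevenfold cross-validation, as used in the evaluation, partitions the small corpus into seven folds and holds each one out in turn, so every transcript is tested exactly once. A minimal index-level sketch of that split (the fold count and seed here are illustrative, not the paper's exact procedure):

```python
import random

def kfold_indices(n_items, k=7, seed=0):
    """Deterministically shuffle item indices and deal them into k folds."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_splits(n_items, k=7, seed=0):
    """Yield (train, test) index lists with each fold held out once."""
    folds = kfold_indices(n_items, k, seed)
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# Example: a hypothetical corpus of 70 conversations -> 7 splits of 60/10.
for train, test in cross_val_splits(70):
    assert len(train) == 60 and len(test) == 10
```

Averaging BLEU, ROUGE-L, and BERTScore across the seven held-out folds gives a less optimistic estimate of transcription quality than a single train/test split, which matters when the validated corpus is small.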