A Severity-Based Curriculum Learning Strategy for Arabic Medical Text Generation

arXiv cs.CL / 4/9/2026

Key Points

  • The paper addresses a limitation in Arabic medical text generation/QA training where existing methods treat all samples as equally important despite varying clinical severity.
  • It proposes a Severity-based Curriculum Learning Strategy that stages fine-tuning from less severe (Mild) to more severe (Moderate/Critical) cases so the model learns basic medical patterns before harder, higher-risk scenarios.
  • The method relies on dataset partitioning using three severity labels (Mild, Moderate, Critical) that were added via a rule-based annotation approach developed in the study.
  • Experiments on a MAQA dataset subset show consistent improvements across multiple models, with reported gains of roughly +4% to +7% over baseline and +3% to +6% versus conventional fine-tuning.
  • The work aims to improve how models handle complex and potentially high-risk clinical content in Arabic, supporting more reliable native-language health guidance.
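
The staged ordering described in the points above can be illustrated with a minimal sketch. It assumes each sample is a dict with a `severity` field holding one of the three labels. The function name `build_curriculum_stages` and the `cumulative` flag are hypothetical; the summary does not say whether later stages replay earlier, milder samples, so both variants are shown.

```python
from collections import defaultdict

# Stage order taken from the summary: Mild -> Moderate -> Critical.
SEVERITY_ORDER = ["Mild", "Moderate", "Critical"]


def build_curriculum_stages(samples, cumulative=True):
    """Return a list of (stage_label, stage_samples) in curriculum order.

    cumulative=True  : each stage also contains all milder samples (assumption).
    cumulative=False : each stage contains only its own severity level.
    """
    by_label = defaultdict(list)
    for sample in samples:
        label = sample["severity"]
        if label not in SEVERITY_ORDER:
            raise ValueError(f"unknown severity label: {label!r}")
        by_label[label].append(sample)

    stages = []
    seen = []  # samples from all stages visited so far
    for label in SEVERITY_ORDER:
        current = by_label[label]
        if cumulative:
            seen = seen + current
            stages.append((label, list(seen)))
        else:
            stages.append((label, list(current)))
    return stages


# Toy data; a fine-tuning loop would train on each stage in turn.
samples = [
    {"question": "q1", "severity": "Critical"},
    {"question": "q2", "severity": "Mild"},
    {"question": "q3", "severity": "Moderate"},
]
stages = build_curriculum_stages(samples)
```

In a fine-tuning run, the model checkpoint from one stage would typically initialize the next, so the model sees Mild cases before the Critical ones; that hand-off is omitted here.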

Abstract

Arabic medical text generation is increasingly needed to help users interpret symptoms and access general health guidance in their native language. Nevertheless, many existing methods assume uniform importance across training samples, overlooking differences in clinical severity. This simplification can hinder the model's ability to properly capture complex or high-risk cases. To overcome this issue, this work introduces a Severity-based Curriculum Learning Strategy for Arabic Medical Text Generation, where the training process is structured to move gradually from less severe to more critical medical conditions. The approach divides the dataset into ordered stages based on severity and incrementally exposes the model to more challenging cases during fine-tuning, allowing it to first learn basic medical patterns before addressing more complex scenarios. The proposed method is evaluated on a subset of the Medical Arabic Question Answering (MAQA) dataset, which includes Arabic medical questions describing symptoms alongside corresponding responses. In addition, the dataset is annotated with three severity levels (Mild, Moderate, and Critical) using a rule-based method developed in this study. The results demonstrate that incorporating severity-aware curriculum learning leads to consistent performance improvements across all tested models, with gains of around +4% to +7% over baseline models and +3% to +6% compared with conventional fine-tuning approaches.
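
The abstract does not describe the rules used for severity annotation. The following is only a sketch of what a rule-based labeler of this general kind might look like, assuming simple keyword matching on the question text. The keyword lists, the function name `label_severity`, and the default-to-Mild fallback are all hypothetical and are not taken from the paper.

```python
# Hypothetical keyword lists, for illustration only; not the paper's rules.
CRITICAL_KEYWORDS = [
    "ألم في الصدر",    # chest pain
    "ضيق في التنفس",   # shortness of breath
    "فقدان الوعي",     # loss of consciousness
    "نزيف",            # bleeding
]
MODERATE_KEYWORDS = [
    "حمى",   # fever
    "قيء",   # vomiting
    "دوخة",  # dizziness
]


def label_severity(question: str) -> str:
    """Assign Mild / Moderate / Critical by checking the most severe rule first."""
    if any(keyword in question for keyword in CRITICAL_KEYWORDS):
        return "Critical"
    if any(keyword in question for keyword in MODERATE_KEYWORDS):
        return "Moderate"
    # Assumption: questions that match no rule are treated as Mild.
    return "Mild"


examples = [
    "أعاني من ألم في الصدر منذ ساعة",  # "I have had chest pain for an hour"
    "عندي حمى خفيفة",                  # "I have a mild fever"
    "عندي صداع بسيط",                  # "I have a slight headache"
]
labels = [label_severity(q) for q in examples]
```

Checking the Critical rules before the Moderate ones means a question that mentions both fever and chest pain receives the higher label. A real annotator would likely also normalize Arabic spelling variants before matching.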