MedAidDialog: A Multilingual Multi-Turn Medical Dialogue Dataset for Accessible Healthcare

arXiv cs.CL / 3/26/2026


Key Points

  • MedAidDialog is introduced as a multilingual, multi-turn medical dialogue dataset intended to emulate realistic physician–patient consultations more faithfully than prior single-turn or template-based resources.
  • The dataset is generated synthetically using large language models, extending MDDial, and is then expanded into a parallel corpus across seven languages (English, Hindi, Telugu, Tamil, Bengali, Marathi, Arabic).
  • A companion conversational medical model, MedAidLM, is presented, trained via parameter-efficient fine-tuning on quantized small language models to support deployment without high-end compute.
  • The framework supports optional patient pre-context (such as age, gender, and allergies) to personalize symptom elicitation and the resulting diagnostic recommendations.
  • Experiments report effective multi-turn symptom elicitation and diagnostic recommendation generation, with medical expert evaluation used to judge plausibility and coherence of consultations.
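The optional pre-context mechanism described above can be illustrated with a minimal sketch. This is not the paper's actual code; the function and field names (`age`, `gender`, `allergies`) are hypothetical, chosen to mirror the examples the paper mentions:

```python
# Illustrative sketch (not the authors' implementation): assembling a
# consultation prompt that optionally prepends patient pre-context
# before the multi-turn dialogue history.

def build_consultation_prompt(turns, pre_context=None):
    """Assemble a multi-turn physician-patient prompt.

    turns: list of (speaker, utterance) pairs in dialogue order
    pre_context: optional dict, e.g. {"age": 34, "gender": "F",
                 "allergies": "penicillin"} (field names hypothetical)
    """
    lines = []
    if pre_context:
        # Serialize pre-context as a single header line so the model
        # can condition symptom elicitation on it.
        ctx = ", ".join(f"{k}: {v}" for k, v in pre_context.items())
        lines.append(f"[Patient context] {ctx}")
    for speaker, utterance in turns:
        lines.append(f"{speaker}: {utterance}")
    lines.append("Physician:")  # cue the model to generate the next turn
    return "\n".join(lines)

prompt = build_consultation_prompt(
    turns=[("Patient", "I have had a sore throat for three days.")],
    pre_context={"age": 34, "gender": "F", "allergies": "penicillin"},
)
print(prompt)
```

Because the pre-context is optional, omitting the `pre_context` argument simply yields the bare dialogue history, matching the framework's design of personalization as an opt-in layer.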

Abstract

Conversational artificial intelligence has the potential to assist users in preliminary medical consultations, particularly in settings where access to healthcare professionals is limited. However, many existing medical dialogue systems operate in a single-turn question--answering paradigm or rely on template-based datasets, limiting conversational realism and multilingual applicability. In this work, we introduce MedAidDialog, a multilingual multi-turn medical dialogue dataset designed to simulate realistic physician--patient consultations. The dataset extends the MDDial corpus by generating synthetic consultations using large language models and further expands them into a parallel multilingual corpus covering seven languages: English, Hindi, Telugu, Tamil, Bengali, Marathi, and Arabic. Building on this dataset, we develop MedAidLM, a conversational medical model trained using parameter-efficient fine-tuning on quantized small language models, enabling deployment without high-end computational infrastructure. Our framework additionally incorporates optional patient pre-context information (e.g., age, gender, allergies) to personalize the consultation process. Experimental results demonstrate that the proposed system can effectively perform symptom elicitation through multi-turn dialogue and generate diagnostic recommendations. We further conduct medical expert evaluation to assess the plausibility and coherence of the generated consultations.
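A parallel corpus in which each consultation exists in all seven languages could be represented along the following lines. This is a hedged sketch of one plausible record layout, not the released MedAidDialog schema; the class and field names are assumptions:

```python
# Illustrative sketch of a parallel multi-turn dialogue record
# (hypothetical layout, not the released MedAidDialog schema).
from dataclasses import dataclass, field

# ISO-style codes for the seven languages named in the paper:
# English, Hindi, Telugu, Tamil, Bengali, Marathi, Arabic.
LANGUAGES = ["en", "hi", "te", "ta", "bn", "mr", "ar"]

@dataclass
class Turn:
    speaker: str   # "patient" or "physician"
    text: dict     # language code -> utterance (parallel translations)

@dataclass
class Dialogue:
    dialogue_id: str
    turns: list = field(default_factory=list)

    def in_language(self, lang):
        """Project the parallel dialogue into a single language."""
        return [(t.speaker, t.text[lang]) for t in self.turns]

d = Dialogue("demo-001")
d.turns.append(Turn("patient", {"en": "I feel dizzy.", "hi": "मुझे चक्कर आ रहा है।"}))
print(d.in_language("en"))
```

Keeping all translations of a turn in one record (rather than seven separate monolingual files) makes it straightforward to guarantee turn-level alignment across languages, which is what "parallel corpus" implies here.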