AI Navigate

LLM-Augmented Computational Phenotyping of Long Covid

arXiv cs.LG / 3/20/2026

📰 News · Models & Research

Key Points

  • The paper introduces Grace Cycle, an LLM-augmented computational phenotyping framework that iteratively integrates hypothesis generation, evidence extraction, and feature refinement to discover clinically meaningful subgroups from longitudinal patient data.
  • It reports the identification of three phenotypes in 13,511 Long Covid participants—Protected, Responder, and Refractory—characterized by distinct patterns in peak symptom severity, baseline disease burden, and longitudinal dose-response, with strong statistical support.
  • The framework demonstrates how large language models can be integrated into a principled, statistically grounded pipeline for phenotypic screening from complex longitudinal data.
  • The authors note that the approach is disease-agnostic and offers a general method for discovering interpretable subphenotypes, suggesting potential applicability beyond Long Covid.

Abstract

Phenotypic characterization is essential for understanding heterogeneity in chronic diseases and for guiding personalized interventions. Long COVID is a complex and persistent condition, yet its clinical subphenotypes remain poorly understood. In this work, we propose an LLM-augmented computational phenotyping framework, "Grace Cycle," that iteratively integrates hypothesis generation, evidence extraction, and feature refinement to discover clinically meaningful subgroups from longitudinal patient data. Applied to 13,511 Long COVID participants, the framework identifies three distinct clinical phenotypes: Protected, Responder, and Refractory. These phenotypes exhibit pronounced separation in peak symptom severity, baseline disease burden, and longitudinal dose-response patterns, with strong statistical support across multiple independent dimensions. This study illustrates how large language models can be integrated into a principled, statistically grounded pipeline for phenotypic screening from complex longitudinal data. The proposed framework is disease-agnostic and offers a general approach for discovering clinically interpretable subphenotypes.
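
The abstract names three iterated steps (hypothesis generation, evidence extraction, feature refinement) but gives no implementation details. The following is a minimal sketch of how such a loop could be structured, not the authors' method. Everything in it is an assumption made for illustration: the `Cohort` layout, the `CANDIDATES` pool, the function names, and the variance filter, which stands in for whatever statistical tests the paper actually uses. The LLM call is replaced by a stub that proposes the next untried candidate.

```python
from statistics import pvariance
from typing import Callable, Dict, List, Set

# Hypothetical data layout: patient id -> symptom scores over time.
Cohort = Dict[str, List[float]]

# Hypothetical pool of candidate features; a real system would obtain
# these from an LLM rather than from a fixed dictionary.
CANDIDATES: Dict[str, Callable[[List[float]], float]] = {
    "peak_severity": max,
    "baseline_burden": lambda series: series[0],
    "constant_flag": lambda series: 1.0,  # deliberately uninformative
}


def generate_hypotheses(tried: Set[str]) -> List[str]:
    """Stub for the LLM step: propose one candidate not yet tried."""
    return [name for name in CANDIDATES if name not in tried][:1]


def extract_evidence(cohort: Cohort, name: str) -> Dict[str, float]:
    """Compute the candidate feature for every patient."""
    summarize = CANDIDATES[name]
    return {pid: float(summarize(series)) for pid, series in cohort.items()}


def run_cycle(cohort: Cohort, max_rounds: int = 5,
              min_variance: float = 1e-9) -> List[str]:
    """Iterate propose -> extract -> refine and return the kept features."""
    tried: Set[str] = set()
    kept: List[str] = []
    for _ in range(max_rounds):
        proposals = generate_hypotheses(tried)
        if not proposals:
            break  # nothing left to propose
        for name in proposals:
            tried.add(name)
            evidence = extract_evidence(cohort, name)
            # Refinement stand-in: keep features that vary across patients.
            if pvariance(list(evidence.values())) > min_variance:
                kept.append(name)
    return kept


toy_cohort: Cohort = {"a": [1, 3, 2], "b": [2, 5, 4], "c": [0, 1, 1]}
print(run_cycle(toy_cohort))
```

On the toy cohort, the constant feature is discarded because it cannot separate patients, while the two varying features are retained. In a fuller pipeline the retained features would then feed a clustering or stratification step.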