Simulating clinical interventions with a generative multimodal model of human physiology

arXiv cs.AI / 5/1/2026


Key Points

  • The paper introduces HealthFormer, a decoder-only transformer that generatively models individual human physiological trajectories using data from the Human Phenotype Project.
  • It tokenizes the multi-visit trajectories of deeply phenotyped participants across 667 measurements spanning seven health domains, and trains the model to forecast each individual's future physiological trajectory.
  • Without task-specific training, HealthFormer reportedly transfers to four independent cohorts and improves prediction for 27 of 30 incident-disease and mortality endpoints; the abstract states that it exceeds established clinical risk scores in every comparison.
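  • The authors demonstrate in-silico intervention simulation: in a held-out personalized nutrition trial, intervention-conditioned predictions recover individual six-month biomarker changes (e.g., Pearson r = 0.78 for diastolic blood pressure). Across 41 randomized intervention-outcome comparisons drawn from published trials, the predicted direction of effect agrees in all 41, and the predicted mean falls within the reported 95% confidence interval in 30.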
  • The work frames HealthFormer as an initial “health world model” in which forecasting, risk stratification, and intervention-conditioned simulation are all expressed as queries on one generative model, which the authors present as a basis for clinical digital twins.
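
To make the tokenization idea concrete, here is a minimal sketch of how a multi-visit trajectory could be turned into a token sequence for a decoder-only model. The summary and abstract do not describe the paper's actual scheme, so everything here is an assumption for illustration: the quantile binning, the `<bos>`/`<visit>`/`<eos>` markers, and the function names `make_bin_edges`, `tokenize_visit`, and `tokenize_trajectory` are all hypothetical.

```python
from bisect import bisect_right


def make_bin_edges(values, n_bins):
    """Return n_bins - 1 interior quantile edges computed from reference values.

    ASSUMPTION: quantile binning is an illustrative choice, not the paper's method.
    """
    ordered = sorted(values)
    edges = []
    for k in range(1, n_bins):
        idx = (k * len(ordered)) // n_bins
        edges.append(ordered[min(idx, len(ordered) - 1)])
    return edges


def tokenize_visit(visit, edges_by_measurement):
    """Map one visit's measurements to "name:bin" string tokens.

    Missing values and measurements without known bin edges are skipped.
    """
    tokens = []
    for name in sorted(visit):  # fixed ordering keeps sequences reproducible
        value = visit[name]
        if value is None or name not in edges_by_measurement:
            continue
        bin_index = bisect_right(edges_by_measurement[name], value)
        tokens.append(f"{name}:{bin_index}")
    return tokens


def tokenize_trajectory(visits, edges_by_measurement):
    """Concatenate visits, in time order, into one sequence with hypothetical markers."""
    tokens = ["<bos>"]
    for visit in visits:
        tokens.append("<visit>")
        tokens.extend(tokenize_visit(visit, edges_by_measurement))
    tokens.append("<eos>")
    return tokens


# Toy usage with made-up numbers (not data from the paper).
edges = {"glucose": make_bin_edges([1, 2, 3, 4, 5, 6, 7, 8], 4)}
trajectory = tokenize_trajectory([{"glucose": 4, "hdl": 1.2}, {"glucose": 7}], edges)
print(trajectory)
```

Under a scheme like this, forecasting amounts to next-token prediction over the visit sequence, which is one way a single generative objective could support the different queries listed above.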

Abstract

Understanding how human health changes over time, and why responses to interventions vary between individuals, remains a central challenge in medicine. Here we present HealthFormer, a decoder-only transformer that models the human physiological trajectory generatively, by training on data from the Human Phenotype Project, a multi-visit cohort of over 15,000 deeply phenotyped individuals. We tokenise each participant's health trajectory across 667 measurements spanning seven domains: blood biomarkers, body composition, sleep physiology, continuous glucose monitoring, gut microbiome, wearable-derived physiology, and behaviour and medication exposure. We train HealthFormer to forecast individual physiological trajectories across these domains, and from this single generative objective a range of clinically relevant tasks can be expressed as queries on the model. We show that, without task-specific training, HealthFormer transfers to four independent cohorts and improves prediction for 27 of 30 incident-disease and mortality endpoints, exceeding established clinical risk scores in every comparison. We further show that the model can simulate interventions in silico: in a held-out personalised-nutrition trial, intervention-conditioned predictions recover individual six-month biomarker changes (e.g., Pearson r = 0.78 for diastolic blood pressure). Across 41 randomised intervention-outcome comparisons drawn from published trials, our results show that the predicted direction of effect agrees in every case, and the predicted mean falls within the reported 95% confidence interval in 30 cases. We position HealthFormer as an initial health world model, from which forecasting, risk stratification, and intervention-conditioned simulation arise as queries, providing a basis for clinical digital twins.
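
The three summary statistics quoted in the abstract (Pearson r for individual changes, direction-of-effect agreement, and the count of predicted means inside the reported 95% confidence interval) can be computed as in the sketch below. This is only an illustration of the metrics as they are conventionally defined; the function names are hypothetical, the numbers are toy values rather than results from the paper, and the authors' exact evaluation procedure may differ.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between predicted and observed individual changes."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def direction_agreement(predicted_effects, reported_effects):
    """Count comparisons where predicted and reported effects share a sign."""
    return sum(1 for p, r in zip(predicted_effects, reported_effects) if p * r > 0)


def ci_coverage(predicted_effects, reported_intervals):
    """Count comparisons where the predicted mean lies inside the reported CI."""
    return sum(
        1
        for p, (low, high) in zip(predicted_effects, reported_intervals)
        if low <= p <= high
    )


# Toy values only (three hypothetical intervention-outcome comparisons).
predicted = [-1.0, 0.5, -2.0]
reported = [-0.5, 0.2, -1.0]
intervals = [(-1.5, -0.2), (0.6, 1.0), (-2.5, -1.5)]

n_agree = direction_agreement(predicted, reported)
n_covered = ci_coverage(predicted, intervals)
print(f"direction agrees in {n_agree} of {len(predicted)}")
print(f"predicted mean inside reported 95% CI in {n_covered} of {len(predicted)}")
```

Direction agreement is the weaker of the two criteria: a prediction can have the right sign and still fall outside the reported interval, which is why the abstract reports agreement in all 41 comparisons but coverage in only 30.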