Scaling Recurrence-aware Foundation Models for Clinical Records via Next-Visit Prediction

arXiv cs.LG · March 26, 2026


Key Points

  • The paper introduces RAVEN, a recurrence-aware generative pretraining approach for sequential electronic health record (EHR) data that predicts a patient’s next visit by autoregressively generating tokenized clinical events conditioned on history.
  • Using data from over one million individuals, the method adds regularization for recurring events and calls out an evaluation pitfall where repeated event tokens can artificially inflate metrics if new onsets are not distinguished from later occurrences.
  • The authors study scaling in a data-constrained, compute-saturated regime and find that increasing model size alone is not effective unless paired with increases in data volume.
  • In zero-shot disease incidence forecasting, RAVEN rivals fully fine-tuned representation-based Transformer models and outperforms widely used simulation-based next-token approaches.
  • Without further parameter updates, RAVEN also demonstrates cross-cohort generalization under lossy clinical code mappings and incomplete feature coverage, suggesting robustness to real-world clinical data variation.
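The evaluation pitfall in the second point can be made concrete with a toy example. A trivial baseline that merely re-predicts every code already present in a patient's history scores well when recurrences are counted, yet fails entirely on genuinely new onsets. Everything below is an illustrative sketch with made-up data, not the paper's implementation:

```python
# Hypothetical sketch of the recurrence evaluation pitfall: a copy-history
# baseline looks strong when repeated event tokens count toward the metric,
# but scores zero when only new onsets are evaluated.

def copy_history_model(history):
    """Naive baseline: predict that the next visit repeats all prior codes."""
    seen = set()
    for visit in history:
        seen |= visit
    return seen

def hit_rate(timelines, onsets_only=False):
    """Fraction of next-visit codes the baseline predicts correctly."""
    hits = total = 0
    for timeline in timelines:
        for t in range(1, len(timeline)):
            history = timeline[:t]
            predicted = copy_history_model(history)
            seen = set().union(*history)
            for code in timeline[t]:
                if onsets_only and code in seen:
                    continue  # skip recurrences; score new onsets only
                total += 1
                hits += code in predicted
    return hits / total if total else 0.0

# Two toy patients: chronic (recurring) codes plus occasional new onsets.
timelines = [
    [{"E11"}, {"E11", "I10"}, {"E11", "I10", "N18"}],
    [{"J45"}, {"J45"}, {"J45", "E11"}],
]

inflated = hit_rate(timelines)                  # recurrences counted -> 0.625
honest = hit_rate(timelines, onsets_only=True)  # new onsets only     -> 0.0
```

The gap between the two numbers is exactly the inflation the authors warn about: a metric that does not separate first occurrences from repeats rewards models for copying history.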

Abstract

While large-scale pretraining has revolutionized language modeling, its potential remains underexplored in healthcare with structured electronic health records (EHRs). We present RAVEN, a novel generative pretraining strategy for sequential EHR data based on Recurrence-Aware next-Visit EveNt prediction. Leveraging a dataset of over one million unique individuals, our model learns to autoregressively generate tokenized clinical events for the next visit conditioned on patient history. We introduce regularization on predicting repeated events and highlight a key pitfall in EHR-based foundation model evaluations: repeated event tokens can inflate performance metrics when new onsets are not distinguished from subsequent occurrences. Furthermore, we empirically investigate the scaling behaviors in a data-constrained, compute-saturated regime, showing that simply increasing model size is suboptimal without commensurate increases in data volume. We evaluate our model via zero-shot prediction for forecasting the incidence of a diverse set of diseases, where it rivals fully fine-tuned representation-based Transformer models and outperforms widely used simulation-based next-token approaches. Finally, without additional parameter updates, we show that RAVEN can generalize to an external patient cohort under lossy clinical code mappings and feature coverage gaps.
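To make the pretraining objective concrete, the following minimal sketch shows one plausible way to flatten structured visits into token sequences for next-visit autoregressive training. The boundary token name, fixed within-visit ordering, and helper functions are assumptions for illustration, not the paper's actual tokenization:

```python
# Illustrative sketch: flatten EHR visits into a token stream so a model can
# autoregressively generate all events of visit t+1 conditioned on visits 1..t.

VISIT_SEP = "[VSEP]"  # hypothetical visit-boundary token

def flatten_timeline(visits):
    """Turn a list of visits (each a set of code strings) into one token list."""
    tokens = []
    for visit in visits:
        tokens.extend(sorted(visit))  # impose a fixed order within a visit
        tokens.append(VISIT_SEP)
    return tokens

def next_visit_examples(visits):
    """Yield (context_tokens, target_tokens) pairs: patient history -> next visit."""
    for t in range(1, len(visits)):
        context = flatten_timeline(visits[:t])
        target = sorted(visits[t]) + [VISIT_SEP]
        yield context, target

visits = [{"E11", "I10"}, {"E11", "N18"}]
pairs = list(next_visit_examples(visits))
# pairs[0] == (['E11', 'I10', '[VSEP]'], ['E11', 'N18', '[VSEP]'])
```

Under this framing, the recurrence-aware regularization described above would act on targets like `'E11'` in the second visit, which repeats a code already in the context, as opposed to the new onset `'N18'`.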