Learning Dynamic Representations and Policies from Multimodal Clinical Time-Series with Informative Missingness

arXiv cs.LG / 4/24/2026


Key Points

  • The paper targets a gap in multimodal clinical time-series modeling: “informative missingness,” in which whether a measurement or note is recorded depends on the patient's latent condition.
  • It proposes a framework that jointly learns multimodal patient representations from structured measurements and clinical notes while modeling the observation patterns, then updates a latent patient state via Bayesian filtering.
  • The learned latent state is used for two downstream tasks: offline treatment policy learning and patient outcome prediction.
  • Experiments on ICU sepsis cohorts from MIMIC-III, MIMIC-IV, and eICU show gains on both tasks: a fitted Q evaluation (FQE) score of 0.679 for the learned treatment policy versus 0.528 for clinician behavior, and an AUROC of 0.886 for post-72-hour mortality prediction on MIMIC-III.
  • The results suggest that incorporating the missing-data-generating process can materially improve both decision-making and prognostic modeling from sparse, multimodal EHR data.
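
To make "observation patterns" concrete, here is a minimal Python sketch of one common way to expose them to a model: alongside each value, pass a binary mask and the number of steps since the feature was last recorded. This is an illustration, not the paper's encoder; the function name `missingness_features`, the NaN-for-missing convention, and the toy lactate series are all assumptions.

```python
import numpy as np

def missingness_features(values):
    """Hypothetical helper: derive observation-pattern features.

    values: (T, D) array with NaN where nothing was recorded.
    Returns (filled, mask, delta):
      filled: last observed value carried forward (0.0 before any observation)
      mask:   1.0 where a measurement was recorded, else 0.0
      delta:  steps since the feature was last recorded
              (counted from the start of the series if never recorded)
    """
    values = np.asarray(values, dtype=float)
    T, D = values.shape
    mask = (~np.isnan(values)).astype(float)
    filled = np.zeros_like(values)
    delta = np.zeros_like(values)
    last = np.zeros(D)   # most recent recorded value per feature
    gap = np.zeros(D)    # steps since that recording
    for t in range(T):
        observed = mask[t] == 1.0
        gap = np.where(observed, 0.0, gap + 1.0)
        last = np.where(observed, values[t], last)
        filled[t] = last
        delta[t] = gap
    return filled, mask, delta

# Toy example (made-up numbers): lactate recorded at steps 0 and 3 only.
lactate = np.array([[2.0], [np.nan], [np.nan], [4.0]])
filled, mask, delta = missingness_features(lactate)
```

A downstream model can then consume `filled`, `mask`, and `delta` together, so that the fact that a test was or was not ordered is available as a signal rather than being hidden by imputation.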

Abstract

Multimodal clinical records contain structured measurements and clinical notes recorded over time, offering rich temporal information about the evolution of patient health. Yet these observations are sparse, and whether they are recorded depends on the patient's latent condition. Observation patterns also differ across modalities, as structured measurements and clinical notes arise under distinct recording processes. While prior work has developed methods that accommodate missingness in clinical time series, how to extract and use the information carried by the observation process itself remains underexplored. We therefore propose a patient representation learning framework for multimodal clinical time series that explicitly leverages informative missingness. The framework combines (1) a multimodal encoder that captures signals from structured and textual data together with their observation patterns, (2) a Bayesian filtering module that updates a latent patient state over time from observed multimodal signals, and (3) downstream modules for offline treatment policy learning and patient outcome prediction based on the learned patient state. We evaluate the framework on ICU sepsis cohorts from MIMIC-III, MIMIC-IV, and eICU. It improves both offline treatment policy learning and adverse outcome prediction, achieving FQE 0.679 versus 0.528 for clinician behavior and AUROC 0.886 for post-72-hour mortality prediction on MIMIC-III.
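
The filtering idea in component (2) can be illustrated with a toy discrete Bayes filter in which the act of recording is itself treated as evidence about the latent state. This is only a sketch under strong assumptions: the paper's module presumably operates on learned representations, whereas the two-state setup, the function name `filter_step`, and every probability below are hypothetical values chosen for illustration.

```python
import numpy as np

def filter_step(belief, transition, p_observe, value_lik=None):
    """One predict/update step of a discrete Bayes filter (illustrative).

    belief:     (K,) posterior over K latent states at the previous step
    transition: (K, K) row-stochastic, transition[i, j] = P(z_t=j | z_{t-1}=i)
    p_observe:  (K,) P(a measurement is recorded | z_t)
    value_lik:  (K,) P(recorded value | z_t), or None if nothing was recorded
    """
    prior = belief @ transition              # predict
    if value_lik is None:
        lik = 1.0 - p_observe                # evidence from absence
    else:
        lik = p_observe * value_lik          # evidence from presence and value
    post = prior * lik                       # update
    return post / post.sum()

# Hypothetical states: index 0 = stable, index 1 = deteriorating.
transition = np.array([[0.9, 0.1],
                       [0.2, 0.8]])
p_observe = np.array([0.2, 0.8])   # assumed: sicker patients are tested more often
belief = np.array([0.5, 0.5])

# A test is recorded but its value is uninformative (equal likelihoods).
after_recorded = filter_step(belief, transition, p_observe, np.array([1.0, 1.0]))
# Nothing is recorded at this step.
after_missing = filter_step(belief, transition, p_observe)
```

In this toy setup, `after_recorded` places more mass on the deteriorating state than `after_missing` does, even though the recorded value carries no information. That gap is the signal a model discards when it treats missingness as ignorable.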