Handling and Interpreting Missing Modalities in Patient Clinical Trajectories via Autoregressive Sequence Modeling

arXiv cs.AI / 4/22/2026

💬 Opinion · Models & Research

Key Points

  • The paper tackles the challenge of missing modalities in multimodal healthcare ML by reframing clinical diagnosis as autoregressive sequence modeling of a patient’s multimodal trajectory.
  • It proposes a missingness-aware contrastive pre-training objective that learns a shared latent space across modalities even when some are absent.
  • Using causal decoders adapted from large language models, the authors model temporal clinical signals while aiming to preserve interpretability.
  • Experiments on MIMIC-IV and eICU fine-tuning benchmarks show that transformer-based autoregressive sequence modeling outperforms baseline approaches.
  • Interpretability analysis finds that removing modalities can cause divergent model behavior across patient stays, and that the contrastive pre-training helps mitigate this issue.
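To make the second point concrete, here is a minimal sketch of what a missingness-aware contrastive objective could look like: an InfoNCE-style loss between two modality embeddings, computed only over patients for whom both modalities were actually observed. This is an illustration of the general idea, not the paper's exact objective; the function name, the two-modality restriction, and the presence mask are all assumptions.

```python
import numpy as np

def masked_infonce(z_a, z_b, present, temperature=0.1):
    """Illustrative contrastive (InfoNCE) loss between two modality
    embeddings, restricted to rows where both modalities are present.

    z_a, z_b : (N, D) L2-normalised embeddings of modalities A and B.
    present  : (N,) boolean mask, True where both modalities were observed.
    """
    # Rows with a missing modality contribute no contrastive signal.
    za, zb = z_a[present], z_b[present]
    logits = za @ zb.T / temperature  # pairwise cross-modal similarities
    # Each row's positive pair sits on the diagonal; softmax over the row.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

A quick sanity check: aligned embeddings (each patient matched with itself across modalities) should score a lower loss than deliberately mismatched ones, while the masked-out rows never enter the computation.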

Abstract

An active challenge in developing multimodal machine learning (ML) models for healthcare is handling missing modalities during training and deployment. As clinical datasets are inherently temporal and sparse in terms of modality presence, capturing the underlying predictive signal via diagnostic multimodal ML models while retaining model explainability remains an ongoing challenge. In this work, we address this by re-framing clinical diagnosis as an autoregressive sequence modeling task, utilizing causal decoders from large language models (LLMs) to model a patient's multimodal trajectory. We first introduce a missingness-aware contrastive pre-training objective that integrates multiple modalities into a shared latent space even in datasets with missingness. We then show that autoregressive sequence modeling with transformer-based architectures outperforms baselines on the MIMIC-IV and eICU fine-tuning benchmarks. Finally, we use interpretability techniques to move beyond performance boosts and find that across various patient stays, removing modalities leads to divergent behavior that our contrastive pre-training mitigates. By abstracting clinical diagnosis as sequence modeling and interpreting patient stay trajectories, we develop a framework to profile and handle missing modalities while addressing the canonical desideratum of safe, transparent clinical AI.
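The re-framing of a patient stay as an autoregressive sequence can be sketched as follows: flatten each timestep's modalities into an interleaved token stream, with an explicit placeholder token wherever a modality was not recorded, so the causal decoder sees missingness as part of the trajectory rather than as an error. The token names (`[VITALS]`, `[MISS]`, etc.) and the fixed modality order are assumptions for illustration, not the paper's actual vocabulary or tokenizer.

```python
def stay_to_sequence(timesteps):
    """Flatten a patient stay into one token sequence for a causal decoder.

    timesteps: list of dicts mapping a modality name to an observation
    token, with None marking a modality not recorded at that step.
    """
    seq = ["[BOS]"]
    for step in timesteps:
        # Fixed modality order keeps positions predictable for the decoder.
        for modality in ("vitals", "labs", "notes"):
            value = step.get(modality)
            seq.append(f"[{modality.upper()}]")
            # Missingness becomes an explicit token, not a silent gap.
            seq.append(value if value is not None else "[MISS]")
    seq.append("[EOS]")
    return seq
```

For example, a single timestep with vitals and a note but no labs yields `["[BOS]", "[VITALS]", "hr_hi", "[LABS]", "[MISS]", "[NOTES]", "n1", "[EOS]"]`, which a standard next-token objective can then model end to end.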