Representation Before Training: A Fixed-Budget Benchmark for Generative Medical Event Models
arXiv cs.LG / 4/21/2026
Key Points
- The paper proposes a fixed-budget, one-epoch pretraining benchmark to isolate how input representation choices affect downstream performance in generative medical event models.
- Using 28 matched transformer models trained on MIMIC-IV and evaluated across 30 clinical outcomes, the study systematically tests representation variants such as quantization granularity, reference-range anchoring, and code-value fusion.
- Code-value fused tokenization yields statistically significant gains: mortality AUROC 0.891→0.915, hospital length-of-stay AUROC 0.763→0.788, and mean regression Spearman rho 0.414→0.494 (see the tokenization sketch after this list).
- For temporal encoding, simple event-order and admission-relative RoPE approaches match or outperform explicit time tokens on average while reducing sequence length by 11% (see the RoPE sketch below).
- CLIF remapping of lab and vital codes preserves downstream performance in the authors' single-site setting and produces a smaller, more clinically interpretable token set intended to support multi-site compatibility (see the remapping sketch below).
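
To make the fusion idea concrete, here is a minimal sketch, assuming "code-value fusion" means concatenating an event code with a quantized, reference-range-anchored value bin into a single vocabulary token. The bin scheme and the helpers `bin_value` and `fuse_token` are invented for illustration, not taken from the paper.

```python
# Hypothetical sketch of code-value fused tokenization (not the paper's code).
# Assumption: "fusion" emits one token per (code, value-bin) pair instead of
# separate code and value tokens.

def bin_value(value: float, low: float, high: float, n_bins: int = 10) -> int:
    """Quantize a lab/vital value into one of n_bins equal-width bins over a
    reference range [low, high]; out-of-range values are clamped."""
    if high <= low:
        raise ValueError("reference range must be non-empty")
    frac = (value - low) / (high - low)
    return min(max(int(frac * n_bins), 0), n_bins - 1)

def fuse_token(code: str, value: float, low: float, high: float) -> str:
    """Fuse an event code and its quantized value into one token string."""
    return f"{code}|Q{bin_value(value, low, high)}"

# A creatinine of 1.4 mg/dL against a 0.6-1.2 reference range becomes a
# single token rather than a code token plus a value token.
print(fuse_token("LAB_CREATININE", 1.4, low=0.6, high=1.2))  # LAB_CREATININE|Q9
```

One practical effect is that each measurement occupies a single sequence position instead of a code token followed by a value token.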

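The temporal-encoding variant can be sketched the same way. The snippet below assumes "admission-relative RoPE" means deriving each event's rotary position from its time since admission rather than inserting explicit time tokens; the hourly resolution and function names are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of admission-relative rotary positions (not the paper's code).
# Assumption: instead of explicit time tokens, each event's RoPE position is
# derived from its elapsed time since hospital admission.

def admission_relative_positions(event_hours: np.ndarray, resolution_h: float = 1.0) -> np.ndarray:
    """Map per-event timestamps (hours since admission) to integer RoPE positions."""
    return np.floor(event_hours / resolution_h).astype(np.int64)

def rope_angles(positions: np.ndarray, head_dim: int, base: float = 10000.0) -> np.ndarray:
    """Standard RoPE angle table: one row per event, head_dim // 2 rotation angles."""
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))
    return np.outer(positions, inv_freq)  # shape (n_events, head_dim // 2)

# Events at 0.5h, 2h, and 26h after admission share one rotating clock,
# so no sequence positions are spent on dedicated time tokens.
pos = admission_relative_positions(np.array([0.5, 2.0, 26.0]))
print(pos)                        # [ 0  2 26]
print(rope_angles(pos, 8).shape)  # (3, 4)
```

Because no sequence positions are spent on dedicated time tokens, this is consistent with the 11% sequence-length reduction reported above.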

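Finally, a hedged sketch of what CLIF-style remapping could look like in code: a lookup table that collapses site-specific item IDs into a smaller, clinically named concept set before tokenization. The mapping entries below are illustrative placeholders, not the paper's actual table.

```python
# Hypothetical sketch of CLIF-style code remapping (not the paper's mapping).
# Assumption: remapping collapses site-specific lab/vital item IDs into a
# smaller set of shared clinical concepts before tokenization.

SITE_TO_CLIF: dict[str, str] = {
    # Illustrative entries only; real mappings come from the CLIF specification.
    "mimic_itemid_220045": "CLIF_HEART_RATE",
    "mimic_itemid_220179": "CLIF_SBP",
    "mimic_itemid_50912":  "CLIF_CREATININE",
}

def remap_code(site_code: str) -> str:
    """Replace a site-specific code with its CLIF concept; keep unmapped
    codes unchanged so no events are silently dropped."""
    return SITE_TO_CLIF.get(site_code, site_code)

events = ["mimic_itemid_220045", "mimic_itemid_50912", "mimic_itemid_999999"]
print([remap_code(e) for e in events])
# ['CLIF_HEART_RATE', 'CLIF_CREATININE', 'mimic_itemid_999999']
```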
