A Multimodal Foundation Model of Spatial Transcriptomics and Histology for Biological Discovery and Clinical Prediction

arXiv cs.AI / 4/7/2026

Key Points

  • STORM is introduced as a multimodal foundation model that learns from 1.2M spatial transcriptomics profiles matched to H&E histology across 18 organs to bridge imaging and molecular omics.
  • The model uses a hierarchical architecture combining morphology, gene expression, and spatial context to generate robust molecular–morphological representations for spatial domain discovery.
  • STORM reportedly improves the prediction of spatial gene expression from H&E images across 11 tumor types compared with existing approaches.
  • The approach is described as platform-agnostic, with consistent performance across major spatial transcriptomics platforms (Visium, Xenium, Visium HD, CosMx).
  • Across 23 independent patient cohorts (7,245 patients), STORM reportedly improves immunotherapy response prediction and prognostication significantly beyond established biomarkers, which the authors present as a scalable framework for clinical precision medicine.

Abstract

Spatial transcriptomics (ST) enables gene expression mapping within anatomical context but remains costly and low-throughput. Hematoxylin and eosin (H&E) staining offers rich morphology yet lacks molecular resolution. We present STORM (Spatial Transcriptomics and histOlogy Representation Model), a foundation model trained on 1.2 million spatially resolved transcriptomic profiles with matched histology across 18 organs. Using a hierarchical architecture integrating morphological features, gene expression, and spatial context, STORM bridges imaging and omics through robust molecular–morphological representations. STORM enhances spatial domain discovery, producing biologically coherent tissue maps, and outperforms existing methods in predicting spatial gene expression from H&E images across 11 tumor types. The model is platform-agnostic, performing consistently across Visium, Xenium, Visium HD, and CosMx. Applied to 23 independent cohorts comprising 7,245 patients, STORM significantly improves immunotherapy response prediction and prognostication over established biomarkers, providing a scalable framework for spatially informed discovery and clinical precision medicine.