OpenTME: An Open Dataset of AI-powered H&E Tumor Microenvironment Profiles from TCGA

arXiv cs.CV / 4/15/2026

Key Points

  • The paper introduces OpenTME, an open-access dataset containing AI-derived tumor microenvironment (TME) profiles from 3,634 TCGA H&E whole-slide images across five cancer types (bladder, breast, colorectal, liver, and lung).
  • Each slide is processed with Atlas H&E-TME, an AI application that performs quality control, segmentation, cell detection/classification, and spatial neighborhood analysis.
  • The pipeline generates more than 4,500 quantitative readouts per slide at cell-level resolution, enabling large-scale, standardized TME characterization from routine histopathology.
  • OpenTME is released for non-commercial academic research on Hugging Face, with plans to expand the dataset over time.
  • The creators position the dataset as a resource for biomarker discovery, spatial biology research, and development of new computational TME analysis methods.

Abstract

The tumor microenvironment (TME) plays a central role in cancer progression, treatment response, and patient outcomes, yet large-scale, consistent, and quantitative TME characterization from routine hematoxylin and eosin (H&E)-stained histopathology remains scarce. We introduce OpenTME, an open-access dataset of pre-computed TME profiles derived from 3,634 H&E-stained whole-slide images across five cancer types (bladder, breast, colorectal, liver, and lung cancer) from The Cancer Genome Atlas (TCGA). All outputs were generated using Atlas H&E-TME, an AI-powered application built on the Atlas family of pathology foundation models, which performs tissue quality control, tissue segmentation, cell detection and classification, and spatial neighborhood analysis, yielding over 4,500 quantitative readouts per slide at cell-level resolution. OpenTME is available for non-commercial academic research on Hugging Face. We will continue to expand OpenTME over time and anticipate it will serve as a resource for biomarker discovery, spatial biology research, and the development of computational methods for TME analysis.
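
To make the idea of a spatial neighborhood readout concrete, here is a minimal sketch in Python. It is not the Atlas H&E-TME method, which the abstract does not describe in detail. The input format (x, y, cell type tuples), the 20 µm radius, the function names, and the toy coordinates are all assumptions made for illustration. The sketch computes one kind of readout such an analysis might produce: the mean number of lymphocytes within a fixed radius of each tumor cell.

```python
from collections import Counter
from math import hypot

# Hypothetical input: one (x_um, y_um, cell_type) tuple per detected cell.
# Real pipeline outputs may use a different schema; this is toy data.
cells = [
    (0.0, 0.0, "tumor"),
    (10.0, 0.0, "lymphocyte"),
    (0.0, 15.0, "lymphocyte"),
    (100.0, 100.0, "fibroblast"),
    (105.0, 100.0, "tumor"),
]


def neighborhood_counts(cells, center_type, radius_um):
    """For each cell of center_type, count neighbors by type within radius_um."""
    results = []
    for i, (xi, yi, ti) in enumerate(cells):
        if ti != center_type:
            continue
        counts = Counter()
        for j, (xj, yj, tj) in enumerate(cells):
            # Skip the center cell itself; keep neighbors inside the radius.
            if i != j and hypot(xi - xj, yi - yj) <= radius_um:
                counts[tj] += 1
        results.append(counts)
    return results


def mean_neighbors(cells, center_type, neighbor_type, radius_um):
    """Average number of neighbor_type cells around each center_type cell."""
    per_cell = neighborhood_counts(cells, center_type, radius_um)
    if not per_cell:
        return 0.0
    return sum(c[neighbor_type] for c in per_cell) / len(per_cell)


# Assumed radius of 20 micrometers, chosen only for this example.
readout = mean_neighbors(cells, "tumor", "lymphocyte", radius_um=20.0)
print(readout)
```

The pairwise loop is quadratic in the number of cells, which is fine for a toy example; a whole-slide image with many cells would more plausibly need a spatial index such as a k-d tree.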