
Discovery of a Hematopoietic Manifold in scGPT Yields a Method for Extracting Performant Algorithms from Biological Foundation Model Internals

arXiv cs.LG / 3/12/2026


Key Points

  • The paper reports the discovery and extraction of a compact hematopoietic algorithm from the single-cell foundation model scGPT, achieved via mechanistic interpretability.
  • It demonstrates that scGPT internally encodes a hematopoietic manifold with developmental branch structure, validated on a non-overlapping Tabula Sapiens panel and transferable to an independent multi-donor immune panel.
  • The authors present a general three-stage extraction method—direct operator export from frozen attention weights, a lightweight adaptor, and a task-specific readout—that yields a standalone algorithm without retraining on the target dataset.
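  • In 88-split donor-holdout benchmarks against scVI, Palantir, DPT, CellTypist, PCA, and raw-expression baselines, the extracted head achieves the strongest pseudotime-depth ordering and leads on key subtype endpoints (CD4/CD8 AUROC 0.867, mono/macro AUROC 0.951). Compared with probing frozen scGPT embeddings with a 3-layer MLP, it completes a 12-split evaluation campaign 34.5x faster with approximately 1000x fewer trainable parameters. The exported operator can be compressed from three pooled attention heads to a single head without statistically significant loss, and further to a rank-64 surrogate.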
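
The three-stage extraction summarized in the key points (frozen operator, lightweight adaptor, task-specific readout) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights are random stand-ins rather than scGPT attention weights, the data are synthetic, and the helper names `fit_adaptor`, `apply_adaptor`, and `fit_readout`, the PCA adaptor, the ridge readout, and all sizes are hypothetical choices made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; none of these come from the paper.
d_model, d_head, n_cells, n_classes, k = 32, 8, 300, 3, 4

# Stage 1: export a fixed linear operator from frozen weights.
# Random stand-ins; a real run would read these from a pretrained checkpoint.
W_v = rng.normal(size=(d_model, d_head))  # assumed value projection
W_o = rng.normal(size=(d_head, d_model))  # assumed output projection
operator = W_v @ W_o                      # frozen, never updated below

# Synthetic "cells": three well-separated classes in embedding space.
y = rng.integers(0, n_classes, size=n_cells)
class_means = 5.0 * rng.normal(size=(n_classes, d_model))
X = class_means[y] + 0.1 * rng.normal(size=(n_cells, d_model))
train, test = slice(0, 200), slice(200, n_cells)

def fit_adaptor(H, k):
    """Stage 2: small learned adaptor; here a PCA projection to k dimensions."""
    mean = H.mean(axis=0)
    _, _, Vt = np.linalg.svd(H - mean, full_matrices=False)
    return mean, Vt[:k].T

def apply_adaptor(H, mean, P):
    """Project with the fitted adaptor and append an intercept column."""
    Z = (H - mean) @ P
    return np.hstack([Z, np.ones((Z.shape[0], 1))])

def fit_readout(Z, y, n_classes, ridge=1e-2):
    """Stage 3: task readout; here ridge regression onto one-hot labels."""
    Y = np.eye(n_classes)[y]
    return np.linalg.solve(Z.T @ Z + ridge * np.eye(Z.shape[1]), Z.T @ Y)

H = X @ operator                          # apply the frozen operator
mean, P = fit_adaptor(H[train], k)
W_r = fit_readout(apply_adaptor(H[train], mean, P), y[train], n_classes)
pred = (apply_adaptor(H[test], mean, P) @ W_r).argmax(axis=1)
accuracy = (pred == y[test]).mean()

# Only the adaptor and readout are fitted; the operator contributes no trainable parameters.
n_trainable = mean.size + P.size + W_r.size
```

The point of the design is that the large component (the operator) stays frozen, so the fitted part is only the small adaptor and readout. That is the general shape of the parameter savings the summary describes, although the actual adaptor and readout used in the paper may differ.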

Abstract

We report the discovery and extraction of a compact hematopoietic algorithm from the single-cell foundation model scGPT, to our knowledge the first biologically useful, competitive algorithm extracted from a foundation model via mechanistic interpretability. We show that scGPT internally encodes a compact hematopoietic manifold with significant developmental branch structure, validated on a strict non-overlap Tabula Sapiens external panel and confirmed via frozen-head zero-shot transfer to an independent multi-donor immune panel. To isolate this geometry, we introduce a general three-stage extraction method consisting of direct operator export from frozen attention weights, a lightweight learned adaptor, and a task-specific readout, producing a standalone algorithm without target-dataset retraining. In 88-split donor-holdout benchmarks against scVI, Palantir, DPT, CellTypist, PCA, and raw-expression baselines, the extracted algorithm achieves the strongest pseudotime-depth ordering and leads on key subtype endpoints (CD4/CD8 AUROC 0.867, mono/macro AUROC 0.951). Compared to standard probing of frozen scGPT embeddings with a 3-layer MLP, the extracted head is BH-significantly better on 6/8 classification endpoints while completing a full 12-split evaluation campaign 34.5x faster with approximately 1000x fewer trainable parameters. The exported operator compresses from three pooled attention heads to a single head without statistically significant loss, and further to a rank-64 surrogate. Mechanistic interpretability of the compact operator reveals a concentrated four-factor core explaining 66.2% of ablation impact, with factors resolving into explicit T/lymphoid, B/plasma, granulocytic, and monocyte/macrophage gene programs. A supplementary second-manifold validation (intercellular communication geometry) confirms that the extraction method generalizes beyond hematopoiesis.
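
The abstract does not say how the rank-64 surrogate is constructed. One common way to compress a linear operator to a fixed rank is truncated SVD, which gives the best rank-k approximation in Frobenius norm (Eckart-Young). The sketch below assumes that approach and uses a synthetic 256 x 256 matrix as a stand-in for an exported operator; the function name `low_rank_surrogate` and all sizes are hypothetical.

```python
import numpy as np

def low_rank_surrogate(W, rank):
    """Return factors A (n x rank), B (rank x m) with A @ B the best rank-`rank`
    approximation of W in Frobenius norm, plus the full singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale the leading left singular vectors
    B = Vt[:rank]               # leading right singular vectors
    return A, B, s

rng = np.random.default_rng(0)
d, rank = 256, 64

# Synthetic stand-in: a rank-64 signal plus small dense noise (an assumption,
# chosen so that a rank-64 surrogate is a good approximation).
W = rng.normal(size=(d, rank)) @ rng.normal(size=(rank, d))
W = W + 0.01 * rng.normal(size=(d, d))

A, B, s = low_rank_surrogate(W, rank)
residual = np.linalg.norm(W - A @ B)     # Frobenius norm of the error
rel_err = residual / np.linalg.norm(W)

# Eckart-Young: the residual equals the norm of the discarded singular values.
tail = np.sqrt(np.sum(s[rank:] ** 2))

params_full = W.size                     # 256 * 256
params_surrogate = A.size + B.size       # 2 * 256 * 64
```

Storing the two factors instead of the full matrix halves the parameter count in this toy setting. Whether a real exported operator tolerates the same truncation depends on how quickly its singular values decay, which is an empirical question the paper addresses with its own benchmarks.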