MorphDistill: Distilling Unified Morphological Knowledge from Pathology Foundation Models for Colorectal Cancer Survival Prediction

arXiv cs.CV / 4/9/2026


Key Points

  • The paper introduces MorphDistill, a two-stage framework that distills organ-specific morphological knowledge from multiple pathology foundation models into a compact CRC-specific encoder for survival prediction.
  • Stage I uses dimension-agnostic multi-teacher relational distillation with supervised contrastive regularization to preserve inter-sample relationships from ten foundation models without requiring explicit feature alignment.
  • Stage II extracts patch-level features from whole-slide images and aggregates them with attention-based multiple instance learning to predict five-year survival.
  • On the Alliance/CALGB 89803 cohort, MorphDistill reports an AUC of 0.68, about an 8% relative improvement over the best baseline, and it outperforms baselines on C-index and hazard ratio.
  • On an external TCGA cohort, the method maintains performance (C-index 0.628), indicating cross-dataset generalization and robustness across clinical subgroups while noting the need for further validation.
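The dimension-agnostic relational distillation in Stage I can be sketched as follows. The idea is that, instead of matching feature vectors directly (which would require projecting every teacher into a shared space), the student matches each teacher's batch-wise similarity matrix, so teachers with different embedding dimensions need no alignment. This is a minimal numpy illustration of that principle, not the paper's exact loss; the function names and the use of cosine similarity with a mean-squared-error match are assumptions.

```python
import numpy as np

def cosine_sim_matrix(feats):
    """Row-normalize features, then take all pairwise cosine similarities.
    For a batch of B samples this is always a B x B matrix, regardless of
    the feature dimension."""
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    unit = feats / np.clip(norms, 1e-8, None)
    return unit @ unit.T

def relational_distill_loss(student_feats, teacher_feats_list):
    """Mean squared difference between the student's batch similarity
    matrix and each teacher's, averaged over teachers. Only B x B
    relational structure is compared, so a 32-d student can learn from
    64-d and 128-d teachers without any projection layers."""
    s_sim = cosine_sim_matrix(student_feats)
    per_teacher = [np.mean((s_sim - cosine_sim_matrix(t)) ** 2)
                   for t in teacher_feats_list]
    return float(np.mean(per_teacher))

# Toy batch: 4 samples; student is 32-d, teachers are 64-d and 128-d.
rng = np.random.default_rng(0)
student = rng.normal(size=(4, 32))
teachers = [rng.normal(size=(4, 64)), rng.normal(size=(4, 128))]
loss = relational_distill_loss(student, teachers)
```

In the paper this term is combined with a supervised contrastive regularizer over colorectal tissue labels; the sketch above covers only the relational part.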

Abstract

Background: Colorectal cancer (CRC) remains a leading cause of cancer-related mortality worldwide. Accurate survival prediction is essential for treatment stratification, yet existing pathology foundation models often overlook organ-specific features critical for CRC prognostication.

Methods: We propose MorphDistill, a two-stage framework that distills complementary knowledge from multiple pathology foundation models into a compact CRC-specific encoder. In Stage I, a student encoder is trained using dimension-agnostic multi-teacher relational distillation with supervised contrastive regularization on large-scale colorectal datasets. This preserves inter-sample relationships from ten foundation models without explicit feature alignment. In Stage II, the encoder extracts patch-level features from whole-slide images, which are aggregated via attention-based multiple instance learning to predict five-year survival.

Results: On the Alliance/CALGB 89803 cohort (n=424, stage III CRC), MorphDistill achieves an AUC of 0.68 (SD 0.08), an approximately 8% relative improvement over the strongest baseline (AUC 0.63). It also attains a C-index of 0.661 and a hazard ratio of 2.52 (95% CI: 1.73-3.65), outperforming all baselines. On an external TCGA cohort (n=562), it achieves a C-index of 0.628, demonstrating strong generalization across datasets and robustness across clinical subgroups.

Conclusion: MorphDistill enables task-specific representation learning by integrating knowledge from multiple foundation models into a unified encoder. This approach provides an efficient strategy for prognostic modeling in computational pathology, with potential for broader oncology applications. Further validation across additional cohorts and disease stages is warranted.
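The Stage II aggregation step can be sketched as a standard attention-based MIL pooling (in the style of Ilse et al.): each patch embedding gets a learned attention score, the scores are softmax-normalized over the slide, and the weighted sum yields a slide-level embedding that a survival head can consume. This is a hedged numpy sketch of the generic technique, not MorphDistill's exact architecture; the parameter shapes, the `abmil_pool` name, and the ungated attention form are assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def abmil_pool(patch_feats, V, w):
    """Attention-based MIL pooling: score each patch with a small
    tanh MLP, softmax the scores across the slide, and return the
    attention-weighted slide embedding plus the weights."""
    scores = np.tanh(patch_feats @ V) @ w      # one score per patch
    attn = softmax(scores)                     # sums to 1 over patches
    return attn @ patch_feats, attn

# Toy slide: 100 patches with 32-d features from the distilled encoder.
rng = np.random.default_rng(1)
patches = rng.normal(size=(100, 32))
V = rng.normal(size=(32, 16)) * 0.1            # attention hidden layer
w = rng.normal(size=(16,)) * 0.1               # attention output weights
slide_emb, attn = abmil_pool(patches, V, w)
# slide_emb would feed a linear risk head for five-year survival.
```

A practical side effect of this pooling is interpretability: the attention weights indicate which tissue regions drive the risk estimate.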