Donor-Aware scRNA-seq Benchmarks for IBD Classification
arXiv stat.ML / 5/6/2026
Key Points
- The paper argues that scRNA-seq disease classification for IBD must use donor-aware cross-validation: randomly splitting cells lets cells from the same donor land in both the training and test sets (pseudoreplication), which can overstate performance.
- It introduces a donor-aware benchmark across two independent IBD cohorts (SCP259 for ulcerative colitis and Kong 2023 for Crohn’s disease) comparing three feature representations: CLR composition, GatedStructuralCFN dependency embeddings, and scVI latent embeddings.
- Under donor-aware evaluation, CLR and CFN both reach high AUROC on SCP259. In the Kong cohort, CFN outperforms linear CLR in the colon, whereas linear models perform better in the terminal ileum.
- Cross-dataset transfer is asymmetric: CD→UC works with an AUC of 0.833, while UC→CD is near chance. Compartment-stratified features also improve CFN edge stability by reducing the spurious instability that arises from global composition.
- It provides code for the benchmark (GitHub link) and concludes that compartment-aware feature construction is key for both predictive accuracy and interpretability of model structure.
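
To make the first two points concrete, here is a minimal sketch of a CLR transform and a donor-grouped split. It is not the paper's released code: the helper names (`clr`, `donor_aware_folds`), the pseudocount value, the toy data, and the use of scikit-learn's `GroupKFold` are all illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import GroupKFold


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of a (samples x cell types) count matrix.

    The pseudocount (an assumed value) avoids log(0) for absent cell types.
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    log_x = np.log(x)
    # Subtract each row's mean log value, i.e. the log of its geometric mean.
    return log_x - log_x.mean(axis=1, keepdims=True)


def donor_aware_folds(donor_ids, n_splits=3):
    """Return (train_idx, test_idx) pairs in which no donor spans both sides."""
    donor_ids = np.asarray(donor_ids)
    placeholder = np.zeros((len(donor_ids), 1))  # GroupKFold only needs the row count
    cv = GroupKFold(n_splits=n_splits)
    return list(cv.split(placeholder, groups=donor_ids))


# Toy data (hypothetical): 6 donors with 4 cells each.
donor_ids = np.repeat(["d1", "d2", "d3", "d4", "d5", "d6"], 4)
folds = donor_aware_folds(donor_ids, n_splits=3)

# Toy donor-level cell-type counts: 6 donors x 3 cell types.
counts = np.array([
    [10, 0, 5],
    [3, 7, 2],
    [8, 1, 1],
    [0, 4, 9],
    [6, 6, 6],
    [2, 0, 12],
])
clr_features = clr(counts)
```

Because whole donors are assigned to folds, every cell from a held-out donor is unseen during training. A plain random split over cells gives no such guarantee, which is the pseudoreplication problem described above.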