Evaluating TabPFN for Mild Cognitive Impairment to Alzheimer's Disease Conversion in Data-Limited Settings

arXiv cs.AI / 5/1/2026


Key Points

  • The study evaluates TabPFN, a tabular foundation-model approach, for predicting 3-year conversion from Mild Cognitive Impairment (MCI) to Alzheimer’s Disease (AD) using the TADPOLE dataset derived from ADNI.
  • It compares TabPFN with traditional ML models (XGBoost, Random Forest, LightGBM, and Logistic Regression) using multimodal biomarker features including demographics, APOE4, MRI volumes, CSF markers, and PET imaging.
  • Across training set sizes ranging from N=50 to N=1000, TabPFN achieved the top performance with AUC=0.892, outperforming LightGBM (AUC=0.860).
  • In low-data regimes (e.g., N=50), TabPFN maintained strong AUC while the traditional models degraded more substantially.
  • The results suggest that foundation-model-style methods can be promising for medical disease prediction tasks where longitudinal data are limited, such as in Alzheimer’s conversion risk modeling.
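The evaluation protocol above can be sketched as follows: hold out a fixed test set, train each model on progressively larger subsets, and compare test-set AUC. This is a minimal illustration using synthetic data and scikit-learn baselines only; the feature matrix, model choices, and training sizes are stand-ins for the study's TADPOLE biomarkers and full model roster. To mirror the paper's comparison, a `TabPFNClassifier` from the `tabpfn` package would be added to the `models` dict (omitted here so the sketch runs with scikit-learn alone).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

SEED = 42
# Stand-in for multimodal biomarker features (demographics, APOE4,
# MRI volumes, CSF markers, PET); the real study uses TADPOLE/ADNI data.
X, y = make_classification(n_samples=1500, n_features=20, n_informative=8,
                           random_state=SEED)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=500, stratify=y, random_state=SEED)

# Baseline models; a TabPFNClassifier would slot in here for the full comparison.
models = {
    "logreg": LogisticRegression(max_iter=1000),
    "rf": RandomForestClassifier(n_estimators=200, random_state=SEED),
}

results = {}
for n in (50, 200, 1000):  # training set sizes mirroring the study's N=50..1000 range
    X_tr, y_tr = X_pool[:n], y_pool[:n]
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
        results[(name, n)] = auc

for (name, n), auc in sorted(results.items()):
    print(f"{name:7s} N={n:4d}  AUC={auc:.3f}")
```

The pattern of interest is how steeply each model's AUC degrades as N shrinks toward 50, which is where the study reports TabPFN retaining its advantage.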

Abstract

Accurate prediction of conversion from Mild Cognitive Impairment (MCI) to Alzheimer's Disease (AD) is essential for early intervention; however, reliable conversion prediction models are difficult to develop due to limited longitudinal data availability. We evaluate TabPFN (Tabular Pre-Trained Foundation Network) against traditional machine learning methods for predicting 3-year MCI-to-AD conversion using the TADPOLE dataset derived from ADNI. Using multimodal biomarker features extracted from demographics, APOE4, MRI volumes, CSF markers, and PET imaging, we conducted an experimental comparison across varying training set sizes (N=50 to 1000) and models including XGBoost, Random Forest, LightGBM, and Logistic Regression. TabPFN achieved the highest performance (AUC=0.892), outperforming LightGBM (AUC=0.860) and demonstrating advantages in low-data settings. At N=50 training samples, TabPFN maintained strong AUC while the traditional machine learning models struggled. These findings demonstrate that foundation models are promising for disease prediction in data-limited scenarios such as Alzheimer's disease.