Improving clinical interpretability of linear neuroimaging models through feature whitening

arXiv cs.LG / 4/23/2026

Key Points

  • Linear neuroimaging models are useful for biomarker discovery, but their weight interpretations are often not clinically meaningful because correlated brain regions cause shared (not region-specific) signals to be mixed into the learned weights.
  • The paper proposes a whitening method that uses prior neuroanatomical knowledge to decorrelate groups of brain regions with known shared variance, aiming to disentangle overlapping information across correlated measures.
  • It also introduces a regularized whitening variant that enables controlled tuning of how strongly the features are decorrelated.
  • Experiments on ROI features for two psychiatric classification tasks (bipolar disorder vs. controls, schizophrenia vs. controls) show improved interpretability of linear model weights while maintaining predictive performance.
  • Unlike PCA/ICA whitening used for dimensionality reduction, the method preserves the full input signal and is designed specifically for feature interpretation rather than feature selection.
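The group-wise whitening idea in the points above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `group_whitening_matrix`, the choice of ZCA (symmetric) whitening, and the `alpha` parameter that blends each group's covariance with the identity are all assumptions made here to show one plausible form of "regularized" whitening. The data are synthetic.

```python
import numpy as np


def group_whitening_matrix(X, groups, alpha=1.0):
    """Block-wise ZCA whitening matrix for column-standardized features X.

    groups: lists of column indices with known shared variance
            (for example, a left/right homologous ROI pair).
    alpha:  hypothetical decorrelation strength in [0, 1];
            0 leaves the features unchanged, 1 whitens each group fully.
    """
    n_features = X.shape[1]
    W = np.eye(n_features)  # features outside every group are left untouched
    for idx in groups:
        idx = np.asarray(idx)
        C = np.cov(X[:, idx], rowvar=False)
        # Assumed regularization: shrink the group covariance toward identity.
        C_reg = alpha * C + (1.0 - alpha) * np.eye(len(idx))
        eigvals, eigvecs = np.linalg.eigh(C_reg)
        # Inverse matrix square root of C_reg (symmetric, hence ZCA).
        W[np.ix_(idx, idx)] = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return W


# Synthetic example: columns 0 and 1 mimic a correlated left/right pair.
rng = np.random.default_rng(0)
n = 500
shared = rng.normal(size=n)
left = shared + 0.3 * rng.normal(size=n)
right = shared + 0.3 * rng.normal(size=n)
other = rng.normal(size=(n, 2))
X = np.column_stack([left, right, other])
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize columns

W = group_whitening_matrix(X, groups=[[0, 1]], alpha=1.0)
X_white = X @ W.T  # same number of features; no dimension is dropped

print("pair correlation before:", np.corrcoef(X[:, 0], X[:, 1])[0, 1])
print("pair correlation after: ", np.corrcoef(X_white[:, 0], X_white[:, 1])[0, 1])
```

Because `W` is square and invertible, the whitened matrix keeps every input dimension, which is the sense in which this kind of transform differs from PCA or ICA used to reduce dimensionality. Intermediate values of `alpha` give partial decorrelation under the assumed shrinkage scheme.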

Abstract

Linear models are widely used in computational neuroimaging to identify biomarkers associated with brain pathologies. However, interpreting the learned weights remains challenging, as they do not always yield clinically meaningful insights. This difficulty arises in part from the inherent correlation between brain regions, which causes linear weights to reflect shared rather than region-specific contributions. In particular, some groups of regions, including homologous structures in the left and right hemispheres, are known to exhibit strong anatomical correlations. In this work, we leverage this prior neuroanatomical knowledge to introduce a whitening approach applied to groups of regions with known shared variance, designed to disentangle overlapping information across correlated brain measures. We additionally propose a regularized variant that allows controlled tuning of the degree of decorrelation. We evaluate this method using region-of-interest features in two psychiatric classification tasks, distinguishing individuals with bipolar disorder or schizophrenia from healthy controls. Importantly, unlike PCA or ICA, which use whitening as a dimensionality reduction step, our approach decorrelates anatomically informed pairs of brain regions while retaining the full input signal, making it specifically suited for feature interpretation rather than feature selection. Our findings demonstrate that whitening improves the interpretability of model weights while preserving predictive performance, providing a robust framework for linking linear model outputs to neurobiological mechanisms.
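One way to read "retaining the full input signal" is through the standard algebra of an invertible whitening transform. The notation below (x for the standardized features of one group, Σ for their covariance, W for the whitening matrix, w for the linear model weights) is introduced here for illustration and is not taken from the paper.

```latex
\begin{align*}
  z &= W x, \qquad W = \Sigma^{-1/2}
      && \text{(whitening of one group)} \\
  \operatorname{Cov}(z) &= W \Sigma W^{\top} = I
      && \text{(whitened features are uncorrelated)} \\
  w^{\top} x &= w^{\top} W^{-1} z = \bigl(W^{-\top} w\bigr)^{\top} z
      && \text{(same linear score, new weights } \tilde{w} = W^{-\top} w\text{)}
\end{align*}
```

Since W is invertible, every linear decision function on x has an exact counterpart on z, so no information is discarded; only the weights change. A model fitted with a penalty on the weights will not in general reproduce the same solution after whitening, because the penalty is not invariant under this change of basis.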