Channel Adaptation for EEG Foundation Models: A Systematic Benchmark Across Architectures, Tasks, and Training Regimes

arXiv cs.LG / April 28, 2026


Key Points

  • The paper benchmarks four channel adaptation methods for EEG foundation models (Conv1d projection, spherical spline interpolation, source-space decomposition, and Riemannian re-centering) across five pretrained models (5M–157M parameters), five downstream tasks, and two training regimes, with 10–15 random seeds each.
  • Rigid-montage models (BENDR, Neuro-GPT) typically require external adaptation, while flexible models (EEGPT, CBraMod) match or exceed external methods natively when fine-tuned but still benefit from them when the encoder is frozen.
  • The authors identify a probe-SFT asymmetry: external adaptation that helps under frozen-encoder (linear-probe) evaluation can cause severe negative transfer when the same flexible models are fine-tuned.
  • The best adaptation approach is architecture-dependent (Conv1d for BENDR, SSI/Riemannian for Neuro-GPT, source-space decomposition for depression detection; a minimal Conv1d sketch follows this list), with the notable result that the 5M-parameter CBraMod outperforms models up to 31× larger on four of five datasets.
  • Overall, the findings suggest that compact EEG-specific architectures can achieve strong performance, and that adaptation strategy selection should be guided by architecture and deployment constraints.

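Of the four adaptation methods, the Conv1d projection is the simplest to picture: a learned 1×1 convolution that linearly mixes whatever channels a recording provides into the fixed montage a rigid encoder such as BENDR expects. The sketch below is a minimal PyTorch illustration of that idea, under stated assumptions; the class name, layer placement, and channel counts are hypothetical and do not reproduce the paper's implementation.

```python
import torch
import torch.nn as nn

class ChannelProjection(nn.Module):
    """Hypothetical sketch of a Conv1d channel adapter: a 1x1 convolution
    that linearly mixes an arbitrary input montage (n_in channels) into the
    fixed channel layout a rigid-montage encoder expects (n_model channels).
    Names and placement are illustrative assumptions, not the paper's code."""

    def __init__(self, n_in: int, n_model: int):
        super().__init__()
        # kernel_size=1 => a learned per-timestep linear map over channels
        self.proj = nn.Conv1d(n_in, n_model, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_in, n_times) -> (batch, n_model, n_times)
        return self.proj(x)

# Example: adapt a 32-channel recording to a 64-channel encoder input.
x = torch.randn(8, 32, 1000)          # 8 trials, 32 channels, 1000 samples
adapter = ChannelProjection(32, 64)
print(adapter(x).shape)               # torch.Size([8, 64, 1000])
```

Because the adapter is trained jointly with the downstream head, it can absorb montage mismatch without touching the pretrained encoder, which is one plausible reason it pairs well with rigid-montage models.
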
Abstract

Scaling EEG foundation models requires pooling data across heterogeneous electrode montages, a prerequisite both for larger pretraining corpora and for downstream deployment. We present the first systematic comparison of four channel adaptation methods (Conv1d projection, spherical spline interpolation (SSI), source-space decomposition, and Riemannian re-centering) across five pretrained EEG foundation models (5M–157M parameters), five downstream tasks, and two training regimes with 10–15 random seeds each. We find that rigid-montage models (BENDR, Neuro-GPT) require external adaptation, while flexible models (EEGPT, CBraMod) match or exceed it natively when fine-tuned but benefit from external methods under frozen-encoder deployment. A probe-SFT asymmetry exists: external adaptation can cause severe negative transfer during fine-tuning of flexible models. The optimal method is architecture-dependent (Conv1d for BENDR, SSI/Riemannian for Neuro-GPT, source-space decomposition for depression detection), and the 5M-parameter CBraMod outperforms models up to 31× larger on 4/5 datasets, consistent with independent findings that compact EEG-specific architectures can match larger models.
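To make Riemannian re-centering concrete, the sketch below whitens a set of per-trial covariance matrices by the inverse square root of their mean, C_i' = M^{-1/2} C_i M^{-1/2}, so the set is centered at the identity regardless of the source montage's statistics. This is a hedged illustration, not the paper's pipeline: for brevity it uses the Euclidean mean, whereas Riemannian re-centering ordinarily uses the geometric (Fréchet) mean, e.g. pyriemann's mean_riemann.

```python
import numpy as np

def invsqrtm(C: np.ndarray) -> np.ndarray:
    """Inverse matrix square root of an SPD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def recenter(covs: np.ndarray) -> np.ndarray:
    """Re-center SPD covariances so their mean maps to the identity:
    C_i' = M^{-1/2} C_i M^{-1/2}. Euclidean mean used here for brevity
    (an assumption; the geometric mean is standard for this technique)."""
    M = covs.mean(axis=0)
    W = invsqrtm(M)
    return np.einsum('ij,njk,kl->nil', W, covs, W)

# Example: 20 trials of 16-channel EEG, 256 samples each.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 16, 256))
covs = np.einsum('nct,ndt->ncd', X, X) / X.shape[-1]  # per-trial covariances
covs_rc = recenter(covs)
print(np.round(covs_rc.mean(axis=0), 2))  # ~ identity matrix
```

Because re-centering acts on second-order statistics rather than raw signals, it can be applied at deployment time without retraining, which fits the frozen-encoder setting where the paper finds external adaptation most useful.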