Self-Supervised Foundation Model for Calcium-imaging Population Dynamics

arXiv cs.AI / 4/8/2026


Key Points

  • The paper introduces CalM, a self-supervised neural foundation model trained only on neuronal calcium traces to support multiple neuroscience objectives with better transferability than task-specific methods.
  • CalM uses a high-performance tokenizer that converts single-neuron traces into a shared discrete vocabulary, along with a dual-axis autoregressive transformer that models dependencies across both neural and time dimensions.
  • Experiments on a large-scale, multi-animal, multi-session calcium imaging dataset show that CalM improves neural population dynamics forecasting over strong specialized baselines after pretraining.
  • With a task-specific head, CalM also adapts effectively to behavior decoding, outperforming supervised decoding models.
  • Representation analysis indicates that CalM learns interpretable functional structures, suggesting value beyond just predictive performance, and the authors note that code will be released soon.
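To make the tokenization idea concrete, here is a minimal sketch of turning per-neuron calcium traces into a discrete (neurons × time) token grid, the kind of input a dual-axis autoregressive transformer would factorize. The paper's tokenizer is learned and high-performance; the uniform-bin quantizer, value range, and vocabulary size below are illustrative assumptions, not the authors' method.

```python
import numpy as np

def tokenize_trace(trace, n_bins=256, lo=-2.0, hi=8.0):
    """Illustrative stand-in for a calcium-trace tokenizer: uniform-bin
    quantization of a single-neuron dF/F trace into discrete token ids.
    (CalM's actual tokenizer is learned; this is only a sketch.)"""
    clipped = np.clip(trace, lo, hi)
    return np.floor((clipped - lo) / (hi - lo) * (n_bins - 1)).astype(int)

# Toy population: 3 neurons x 100 time steps of synthetic dF/F values.
rng = np.random.default_rng(0)
traces = rng.normal(0.0, 1.0, size=(3, 100))
grid = np.stack([tokenize_trace(t) for t in traces])  # shape: (neurons, time)

# A dual-axis autoregressive model then predicts token (n, t) conditioned on
# tokens at earlier time steps and on other neurons' tokens, capturing
# dependencies along both the temporal and the neural axis.
print(grid.shape)                      # (3, 100)
print(grid.min() >= 0, grid.max() < 256)  # tokens stay inside the vocabulary
```

A shared discrete vocabulary like this is what lets traces from different neurons, sessions, and animals be modeled with one transformer, analogous to subword vocabularies in language models.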

Abstract

Recent work suggests that large-scale, multi-animal modeling can significantly improve neural recording analysis. However, for functional calcium traces, existing approaches remain task-specific, limiting transfer across common neuroscience objectives. To address this challenge, we propose **CalM**, a self-supervised neural foundation model trained solely on neuronal calcium traces and adaptable to multiple downstream tasks, including forecasting and decoding. Our key contribution is a pretraining framework, composed of a high-performance tokenizer mapping single-neuron traces into a shared discrete vocabulary, and a dual-axis autoregressive transformer modeling dependencies along both the neural and the temporal axis. We evaluate CalM on a large-scale, multi-animal, multi-session dataset. On the neural population dynamics forecasting task, CalM outperforms strong specialized baselines after pretraining. With a task-specific head, CalM further adapts to the behavior decoding task and achieves superior results compared with supervised decoding models. Moreover, linear analyses of CalM representations reveal interpretable functional structures beyond predictive accuracy. Taken together, we propose a novel and effective self-supervised pretraining paradigm for foundation models based on calcium traces, paving the way for scalable pretraining and broad applications in functional neural analysis. Code will be released soon.