
NeuronSpark: A Spiking Neural Network Language Model with Selective State Space Dynamics

arXiv cs.AI / 3/18/2026


Key Points

  • NeuronSpark is a 0.9B-parameter spiking neural network (SNN) language model trained from random initialization with next-token prediction and surrogate gradients, without Transformer distillation.
  • The model employs selective state-space spiking dynamics, leakage-current inter-layer communication, PonderNet adaptive timesteps, fused Triton PLIF kernels, and stabilization techniques such as residual centering, lateral-inhibition normalization, and natural-gradient compensation (a minimal PLIF/surrogate-gradient sketch follows this list).
  • With a constrained pretraining budget (~1.4B tokens) and 6.5K supervised fine-tuning (SFT) steps, NeuronSpark reaches a pretraining loss of 3.6 and shows early multi-turn dialogue behavior after SFT.
  • The results demonstrate the feasibility of end-to-end language modeling with a pure SNN architecture at this scale, suggesting new directions for neuromorphic NLP.
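
This summary does not include any code, so the block below is only a rough PyTorch sketch of the two ingredients named most concretely above: a parametric leaky integrate-and-fire (PLIF) neuron and a surrogate gradient for its non-differentiable spike. The sigmoid surrogate, the decay parametrization sigmoid(w) = 1/tau, the hard reset, and every identifier here are illustrative assumptions; NeuronSpark's fused Triton kernels presumably implement a related but not identical update.

```python
import math

import torch
import torch.nn as nn


class SurrogateSpike(torch.autograd.Function):
    """Heaviside step in the forward pass; sigmoid-derivative surrogate in the backward pass."""

    @staticmethod
    def forward(ctx, v_minus_thresh, alpha=4.0):
        ctx.save_for_backward(v_minus_thresh)
        ctx.alpha = alpha
        return (v_minus_thresh >= 0).to(v_minus_thresh.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        sig = torch.sigmoid(ctx.alpha * v)
        # d/dv sigmoid(alpha * v) = alpha * sig * (1 - sig)
        return grad_output * ctx.alpha * sig * (1.0 - sig), None


class PLIFNeuron(nn.Module):
    """Parametric LIF: the membrane decay is learned, with sigmoid(w) = 1/tau (assumed form)."""

    def __init__(self, init_tau: float = 2.0, v_threshold: float = 1.0):
        super().__init__()
        # Chosen so that sigmoid(w) == 1/init_tau at initialization.
        self.w = nn.Parameter(torch.tensor(-math.log(init_tau - 1.0)))
        self.v_threshold = v_threshold

    def forward(self, x_seq: torch.Tensor) -> torch.Tensor:
        # x_seq: (T, batch, features) input currents over T timesteps.
        decay = torch.sigmoid(self.w)
        v = torch.zeros_like(x_seq[0])
        spikes = []
        for x_t in x_seq:
            v = v + decay * (x_t - v)                       # leaky integration, learned time constant
            s = SurrogateSpike.apply(v - self.v_threshold)  # binary spike, differentiable surrogate
            v = v * (1.0 - s)                               # hard reset where a spike fired
            spikes.append(s)
        return torch.stack(spikes)


if __name__ == "__main__":
    neuron = PLIFNeuron()
    out = neuron(torch.randn(4, 2, 8))  # 4 timesteps, batch 2, 8 features
    out.sum().backward()                # surrogate gradient flows to neuron.w
    print(out.shape, neuron.w.grad)
```

Because spikes are binary, the Heaviside nonlinearity has zero gradient almost everywhere; the surrogate replaces it with a smooth sigmoid derivative during backpropagation, which is what allows end-to-end next-token training of a spiking stack in the first place.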

Abstract

We ask whether a pure spiking backbone can learn large-scale language modeling from random initialization, without Transformer distillation. We introduce NeuronSpark, a 0.9B-parameter SNN language model trained with next-token prediction and surrogate gradients. The model combines selective state-space spiking dynamics, leakage-current inter-layer communication, PonderNet adaptive timesteps, fused Triton PLIF kernels, and stabilization techniques (residual centering, lateral-inhibition normalization, and natural-gradient compensation). Under a constrained budget (about 1.4B pretraining tokens and 6.5K SFT steps), NeuronSpark-0.9B reaches a pretraining loss of 3.6 and shows early multi-turn dialogue behavior after SFT. These results support the feasibility of end-to-end language modeling with a pure SNN architecture at this scale.
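
The abstract also credits PonderNet adaptive timesteps. The paper's exact formulation is not reproduced in this summary, but the general mechanism, a learned per-step halting probability that turns the number of internal steps into a random variable, can be sketched as below; `PonderHalting`, `step_fn`, and the expected-state readout are hypothetical stand-ins rather than the authors' implementation.

```python
import torch
import torch.nn as nn


class PonderHalting(nn.Module):
    """PonderNet-style adaptive compute: each internal step emits a halting
    probability, making the number of steps a learned random variable."""

    def __init__(self, hidden_dim: int, max_steps: int = 8):
        super().__init__()
        self.max_steps = max_steps
        self.halt = nn.Linear(hidden_dim, 1)

    def forward(self, step_fn, h: torch.Tensor):
        # step_fn: one internal update (for NeuronSpark this might be a block of
        #          spiking timesteps; here it is an arbitrary callable).
        # h:       (batch, hidden_dim) initial state.
        p_not_halted = torch.ones(h.shape[0], device=h.device)
        halt_probs, states = [], []
        for t in range(self.max_steps):
            h = step_fn(h)
            lam = torch.sigmoid(self.halt(h)).squeeze(-1)  # lambda_t in (0, 1)
            if t == self.max_steps - 1:
                lam = torch.ones_like(lam)                 # force halting at the last step
            halt_probs.append(p_not_halted * lam)          # p_t = lambda_t * prod_{s<t}(1 - lambda_s)
            p_not_halted = p_not_halted * (1.0 - lam)
            states.append(h)
        p = torch.stack(halt_probs)   # (max_steps, batch); sums to 1 over steps
        hs = torch.stack(states)      # (max_steps, batch, hidden_dim)
        # Simplified readout: expectation of the per-step states under p.
        return (p.unsqueeze(-1) * hs).sum(dim=0), p
```

PonderNet proper trains with an expected loss over steps plus a KL regularizer toward a geometric prior and samples the halting step at inference; the expectation readout above is a simplification kept only so the sketch stays self-contained.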