Collapse or Preserve: Data-Dependent Temporal Aggregation for Spiking Neural Network Acceleration

arXiv cs.LG / 3/17/2026

Key Points

  • Spike sparsity does not yield performance gains on Apple M3 Max because SIMD units cannot exploit the fine-grained, unstructured sparsity of i.i.d. binary spikes.
  • The authors introduce Temporal Aggregated Convolution (TAC), which pre-aggregates K spike frames so that T convolution calls become T/K, achieving up to 13.8× speedups with small accuracy changes on MNIST and Fashion-MNIST for rate-coded data.
  • For event-based data, where temporal information matters, TAC's temporal collapse hurts accuracy, so the authors propose TAC-TP (Temporal Preservation), which shares each convolution output across K LIF steps to preserve temporal resolution, achieving 95.1% on DVS128-Gesture with 50% fewer convolution calls.
  • The study concludes that the optimal temporal aggregation strategy is data-dependent and demonstrates hardware-agnostic speedups (including 11.0× on an NVIDIA V100) with open-source mlx-snn operators.
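TAC rests on a simple algebraic fact: convolution is linear, so convolving the sum of K binary spike frames gives the same result as summing K per-frame convolutions, at the cost of a single convolution call. A minimal NumPy sketch of that identity (illustrative only; the names `conv2d`, `frames`, and `kernel` are this sketch's, not the mlx-snn API):

```python
import numpy as np

def conv2d(x, k):
    """Naive 'valid' 2-D cross-correlation; enough to show linearity."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
K = 4                                                   # aggregation window
frames = rng.integers(0, 2, size=(K, 8, 8)).astype(float)  # K binary spike frames
kernel = rng.standard_normal((3, 3))

# Baseline: one convolution per spike frame, K calls in total.
per_frame = sum(conv2d(f, kernel) for f in frames)

# TAC: pre-aggregate the K frames, then a single convolution call.
aggregated = conv2d(frames.sum(axis=0), kernel)

assert np.allclose(per_frame, aggregated)  # identical by linearity
```

Applied over T timesteps with window K, the T per-frame calls collapse to T/K calls on aggregated frames, which is where the reported speedups come from.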

Abstract

Spike sparsity is widely believed to enable efficient spiking neural network (SNN) inference on GPU hardware. We demonstrate this is an illusion: five distinct sparse computation strategies on Apple M3 Max all fail to outperform dense convolution, because SIMD architectures cannot exploit the fine-grained, unstructured sparsity of i.i.d. binary spikes. Instead, we propose Temporal Aggregated Convolution (TAC), which exploits convolution linearity to pre-aggregate K spike frames before a single convolution call, reducing T calls to T/K. On rate-coded data, TAC achieves 13.8× speedup with +1.6% accuracy on MNIST and +5.4% on Fashion-MNIST: a simultaneous improvement in both speed and accuracy. However, on event-based data where the temporal dimension carries genuine motion information, TAC's temporal collapse is harmful. We therefore introduce TAC-TP (Temporal Preservation), which shares each group's convolution output across K independent LIF steps, preserving full temporal resolution for downstream layers. On DVS128-Gesture, TAC-TP achieves 95.1% accuracy (vs. 96.3% baseline) with 50% fewer convolution calls, while standard TAC drops to 91.3%. Our key finding is that the optimal temporal aggregation strategy is data-dependent: collapse the temporal dimension for rate-coded data (noise reduction) but preserve it for event data (information retention). Speedup is hardware-agnostic: TAC achieves 11.0× on NVIDIA V100, confirming the mechanism transfers across GPU architectures. All operators in the mlx-snn library are open source.
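The TAC-TP variant described in the abstract can be sketched as follows: one convolution output per group of K timesteps is reused as the input current for K independent leaky integrate-and-fire (LIF) updates, so the layer still emits K spike frames and downstream layers keep full temporal resolution. This is an illustrative standard-LIF sketch under assumed dynamics (decay `beta`, threshold `v_th`, hard reset), not the mlx-snn implementation:

```python
import numpy as np

def lif_steps(shared_current, K, beta=0.9, v_th=1.0):
    """Run K LIF steps driven by one shared convolution output.

    TAC-TP idea (sketched): a single conv call per group of K timesteps
    supplies the same input current to each of the K membrane updates,
    so T timesteps need only T/K conv calls yet still yield T spike frames.
    """
    v = np.zeros_like(shared_current)       # membrane potential
    spikes = []
    for _ in range(K):
        v = beta * v + shared_current       # leaky integration of shared current
        s = (v >= v_th).astype(float)       # fire where threshold is crossed
        v = np.where(s > 0, 0.0, v)         # hard reset after a spike
        spikes.append(s)
    return np.stack(spikes)                 # shape (K, *current.shape)

# Example: a weak uniform current needs three steps of integration to fire.
out = lif_steps(np.full((2, 2), 0.5), K=3)
```

Because each of the K steps integrates and resets independently, temporal structure in the output spike train survives, which is the property standard TAC gives up when it collapses the K frames before the convolution.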