Negative Binomial Variational Autoencoders for Overdispersed Latent Modeling

arXiv stat.ML / 4/9/2026


Key Points

  • The paper introduces NegBio-VAE, a negative-binomial variational autoencoder designed to model overdispersed spike-count data: a dispersion parameter relaxes the Poisson distribution's equal mean-variance constraint.
  • It aims to improve biological plausibility and representational expressiveness by using discrete, count-based latent variables while retaining interpretability typical of latent-variable models.
  • The authors propose new KL-divergence estimation and reparameterization techniques to make training feasible and stable for the negative-binomial latent-variable formulation.
  • Experiments across four datasets show NegBio-VAE achieves better reconstruction and generation performance than competing single-layer VAE baselines and produces more informative latent representations for downstream tasks.
  • Extensive ablation studies validate the robustness of key components and their contribution to performance gains.
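The motivation behind the dispersion parameter can be illustrated numerically. The sketch below (not from the paper; `mu` and `r` are hypothetical values) compares Poisson samples, whose variance matches their mean, with negative-binomial samples of the same mean, whose variance exceeds the mean by `mu**2 / r` — the overdispersion a Poisson latent cannot capture:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, r = 5.0, 2.0          # target mean and a hypothetical dispersion parameter
p = r / (r + mu)          # NB success probability chosen so the mean equals mu

poisson = rng.poisson(mu, size=200_000)
negbin = rng.negative_binomial(r, p, size=200_000)

print(poisson.mean(), poisson.var())  # variance ~ mean (equidispersed)
print(negbin.mean(), negbin.var())    # variance ~ mu + mu**2 / r > mean (overdispersed)
```

As `r` grows, the negative binomial approaches the Poisson, so the Poisson VAE's noise model is recovered as a limiting case.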

Abstract

Although artificial neural networks are often described as brain-inspired, their representations typically rely on continuous activations, such as the continuous latent variables in variational autoencoders (VAEs), which limits their biological plausibility compared to the discrete spike-based signaling in real neurons. Extensions like the Poisson VAE introduce discrete count-based latents, but their equal mean-variance assumption fails to capture overdispersion in neural spikes, leading to less expressive and informative representations. To address this, we propose NegBio-VAE, a negative-binomial latent-variable model with a dispersion parameter for flexible spike count modeling. NegBio-VAE preserves interpretability while improving representation quality and training feasibility via novel KL estimation and reparameterization. Experiments on four datasets demonstrate that NegBio-VAE consistently achieves superior reconstruction and generation performance compared to competing single-layer VAE baselines, and yields robust, informative latent representations for downstream tasks. Extensive ablation studies are performed to verify the model's robustness w.r.t. various components. Our code is available at https://github.com/co234/NegBio-VAE.
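The abstract mentions novel reparameterization techniques for the negative-binomial latents; the paper's specific construction is not detailed here, but one standard view that makes NB sampling amenable to such tricks is the Gamma-Poisson mixture: drawing a Gamma-distributed rate and then a Poisson count with that rate yields a negative-binomial marginal. A minimal sketch (hypothetical `mu`, `r`; not necessarily the authors' method):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, r = 5.0, 2.0  # hypothetical mean and dispersion

# NB(mu, r) as a Gamma-Poisson mixture:
# rate ~ Gamma(shape=r, scale=mu/r) has mean mu,
# counts ~ Poisson(rate) then has NB marginal with
# mean mu and variance mu + mu**2 / r.
rate = rng.gamma(shape=r, scale=mu / r, size=200_000)
counts = rng.poisson(rate)

print(counts.mean(), counts.var())
```

Decomposing the discrete NB draw into a continuous Gamma draw plus a Poisson draw is what typically opens the door to pathwise or hybrid gradient estimators for count-valued latents.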