Heavy-Tailed Class-Conditional Priors for Long-Tailed Generative Modeling

arXiv stat.ML / April 16, 2026


Key Points

  • The paper shows that VAEs with a single global prior, trained on an imbalanced empirical class distribution, underrepresent tail classes in the latent space.
  • It proposes C-$t^3$VAE, which replaces the single global prior with a per-class heavy-tailed Student's $t$ joint prior over latent and output variables, reducing this latent geometric bias.
  • The authors derive a closed-form training objective from the $\gamma$-power divergence and use an equal-weight latent mixture to enable class-balanced generation.
  • Experiments on SVHN-LT, CIFAR100-LT, and CelebA show consistently lower FID than $t^3$VAE and Gaussian-based VAE baselines under severe class imbalance, along with strong per-class F1 improvements.
  • They identify a practical mild-imbalance threshold ($\rho < 5$) below which Gaussian baselines remain competitive, and demonstrate clear gains in class-balanced generation and mode coverage for $\rho \ge 5$.
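The equal-weight latent mixture above can be sketched in a few lines: draw a class uniformly (ignoring the skewed training frequencies), then sample the latent from that class's Student's $t$ prior via the standard Gaussian/chi-square representation. All parameters here (number of classes, latent dimension, degrees of freedom, per-class locations) are illustrative placeholders, not the paper's actual settings, and the decoder is omitted.

```python
# Sketch: class-balanced latent sampling from an equal-weight mixture of
# per-class Student's t priors. Hypothetical parameters throughout.
import numpy as np

rng = np.random.default_rng(0)

n_classes, latent_dim, nu = 4, 2, 5.0                   # nu: degrees of freedom (heavy tails)
class_locs = rng.normal(size=(n_classes, latent_dim))   # per-class prior locations (illustrative)

def sample_balanced(n_samples):
    """Pick a class uniformly, then sample that class's multivariate
    Student's t prior via the Gaussian / chi-square mixture representation."""
    classes = rng.integers(0, n_classes, size=n_samples)  # equal mixture weights
    g = rng.standard_normal((n_samples, latent_dim))
    u = rng.chisquare(nu, size=(n_samples, 1))
    z = class_locs[classes] + g * np.sqrt(nu / u)
    return classes, z

classes, z = sample_balanced(100_000)
# Equal weights => each class appears ~1/n_classes of the time,
# regardless of how imbalanced the training data were.
counts = np.bincount(classes, minlength=n_classes) / len(classes)
print(counts)
```

Decoding each sampled `z` conditioned on its class would then yield a class-balanced set of generations.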

Abstract

Variational Autoencoders (VAEs) with a global prior, trained under an imbalanced empirical class distribution, tend to underrepresent tail classes in the latent space. While $t^3$VAE improves robustness via a heavy-tailed Student's $t$ prior, its single global prior still allocates mass proportionally to class frequency. We address this latent geometric bias by introducing C-$t^3$VAE, which assigns a per-class Student's $t$ joint prior over latent and output variables, promoting uniform prior mass across class-conditioned components. To optimize the model we derive a closed-form objective from the $\gamma$-power divergence, and we introduce an equal-weight latent mixture for class-balanced generation. On SVHN-LT, CIFAR100-LT, and CelebA, C-$t^3$VAE consistently attains lower FID than $t^3$VAE and Gaussian-based VAE baselines under severe class imbalance while remaining competitive in balanced or mildly imbalanced settings. In per-class F1 evaluations, it outperforms the conditional Gaussian VAE across highly imbalanced settings. Moreover, we identify a mild-imbalance threshold ($\rho < 5$) below which Gaussian-based models remain competitive, whereas for $\rho \geq 5$ our approach yields improved class-balanced generation and mode coverage.
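The intuition behind heavy-tailed priors can be made concrete with a quick tail-mass comparison: a Student's $t$ places orders of magnitude more probability far from its mode than a Gaussian, so latents for rare classes pushed away from dense regions are penalized far less. The threshold and degrees of freedom below are illustrative choices, not values from the paper.

```python
# Sketch: two-sided tail mass P(|X| > 4) for a standard Gaussian vs a
# Student's t with nu = 3 degrees of freedom (illustrative values).
from scipy.stats import norm, t

thresh = 4.0
gauss_tail = 2 * norm.sf(thresh)   # two-sided Gaussian tail mass
t_tail = 2 * t.sf(thresh, df=3)    # two-sided Student's t tail mass

print(f"Gaussian: {gauss_tail:.2e}, Student's t (nu=3): {t_tail:.2e}")
# The t distribution retains far more mass in the tails, so a prior built
# on it does not collapse rare-class latents toward the head classes.
```

The same comparison underlies why a single heavy-tailed global prior ($t^3$VAE) already helps, and why per-class $t$ priors help further once class frequencies are severely skewed.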