Free Decompression with Algebraic Spectral Curves

arXiv stat.ML / 5/6/2026

Key Points

  • The paper addresses a practical limitation of deep learning theory built on random matrix theory: spectral computations are limited by matrix size, which often restricts the analysis to models too small to be realistic and motivates extrapolating from smaller models to larger ones.
  • It generalizes Free Decompression (FD), a recently proposed method for extrapolating spectral information across matrix sizes, by using algebraic spectral curve theory. The modeling assumption is that the Stieltjes transform of the spectral density satisfies an algebraic relation.
  • The framework recasts FD as an evolution along spectral curves that can be readily integrated, relaxing the strong assumptions that kept earlier FD from being applied to realistic ML models.
  • It handles spectral densities with multiple or multi-modal bulks, with structure at multiple scales, and with atoms, all of which are characteristic of real-world data and popular ML models.
  • The authors demonstrate the method on matrices of interest in modern ML, including Hessian and activation matrices associated with neural networks and large-scale diffusion models.
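
To make the "algebraic relation" assumption concrete, here is a minimal sketch using the textbook Marchenko-Pastur law rather than anything taken from the paper. For a sample covariance matrix with aspect ratio c = p/n and unit variance, the Stieltjes transform m(z) = ∫ ρ(x)/(x − z) dx is a root of the quadratic c·z·m² + (z + c − 1)·m + 1 = 0. The function names, matrix sizes, and evaluation point below are illustrative assumptions.

```python
import numpy as np

def empirical_stieltjes(eigs, z):
    """Empirical Stieltjes transform: average of 1 / (lambda_i - z)."""
    return np.mean(1.0 / (eigs - z))

def mp_residual(m, z, c):
    """Left-hand side of the Marchenko-Pastur quadratic (zero at the exact m)."""
    return c * z * m**2 + (z + c - 1.0) * m + 1.0

rng = np.random.default_rng(0)
n, p = 2000, 500                 # illustrative sizes (assumption)
c = p / n                        # aspect ratio
X = rng.standard_normal((n, p))  # i.i.d. standard Gaussian entries
S = X.T @ X / n                  # p x p sample covariance
eigs = np.linalg.eigvalsh(S)

z = 1.0 + 0.5j                   # evaluation point in the upper half-plane
m_emp = empirical_stieltjes(eigs, z)

# Solve the quadratic in m and keep the root with the larger imaginary part,
# since a Stieltjes transform maps the upper half-plane into itself.
roots = np.roots([c * z, z + c - 1.0, 1.0])
m_theory = roots[np.argmax(roots.imag)]

residual = abs(mp_residual(m_emp, z, c))
gap = abs(m_emp - m_theory)
print("residual of algebraic relation:", residual)
print("|m_emp - m_theory|:", gap)
```

Both printed quantities should be small at these sizes and shrink as n and p grow with c fixed. The spectra targeted by the paper would satisfy higher-degree or more structured relations, and this sketch makes no attempt to reproduce them.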

Abstract

Tools from random matrix theory have become central to deep learning theory, using spectral information to provide mechanisms for modeling generalization, robustness, scaling, and failure modes. While often capable of modeling empirical behavior, practical computations are limited by matrix size, often imposing a restriction to models that are too small to be realistic. This motivates the inference of properties of larger models from the behavior of smaller ones. Free decompression (FD) is a recently proposed method for extrapolating spectral information across matrix sizes, but its utility is currently limited by strong assumptions that preclude its implementation on more realistic machine learning (ML) models. We use algebraic spectral curve theory to provide a general FD methodology for spectral densities whose Stieltjes transform satisfies an algebraic relation, a modeling assumption that is more likely to hold in practice. This recasts FD as an evolution along spectral curves which can be readily integrated. Our framework enables the expansion of spectral densities that have multiple or multi-modal bulks, that exist at multiple scales, and that contain atoms, all characteristic of real-world data and popular ML models. We demonstrate the efficacy of our framework on models of interest in modern ML, including Hessian and activation matrices associated with neural networks and large-scale diffusion models.
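
As a small illustration of what a spectral "atom" looks like in practice, the sketch below uses a standard example that is not drawn from the paper: a sample covariance matrix built from fewer samples than dimensions. Its rank is at most n, so the spectrum has a point mass at zero of weight 1 − n/p alongside a continuous bulk. The sizes and the tolerance are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 500                  # fewer samples than dimensions (assumption)
X = rng.standard_normal((n, p))
S = X.T @ X / n                  # p x p matrix of rank at most n
eigs = np.linalg.eigvalsh(S)

tol = 1e-8                       # threshold separating numerical zeros from the bulk
atom_mass = np.mean(np.abs(eigs) < tol)  # fraction of eigenvalues at zero
expected_mass = 1.0 - n / p              # rank deficiency divided by dimension
bulk_min = eigs[np.abs(eigs) >= tol].min()

print("empirical atom mass at zero:", atom_mass)
print("expected atom mass 1 - n/p:", expected_mass)
print("smallest bulk eigenvalue:", bulk_min)
```

The gap between the atom at zero and the smallest bulk eigenvalue is what makes a fixed tolerance workable here. Spectra with atoms at other locations, or with bulks at very different scales, would need a more careful separation.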