Mixed Membership sub-Gaussian Models

arXiv stat.ML / 4/27/2026

Key Points

  • The paper proposes a new “mixed membership sub-Gaussian” model that generalizes Gaussian mixture models by allowing each observation to partially belong to multiple latent components rather than exactly one (a data-generating sketch of this model follows this list).
  • It introduces an efficient spectral algorithm to estimate each individual’s mixed-membership vector, with theoretical guarantees that the estimation error can be made arbitrarily small under mild separation assumptions.
  • The authors claim this is the first computationally efficient estimator for a mixed-membership extension of Gaussian mixtures with a vanishing-error bound.
  • Experiments in application-style settings where overlapping membership is natural (e.g., genetics, social networks, and text mining) show the method outperforms approaches that assume hard, single-component membership.
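
For intuition, here is a minimal data-generating sketch consistent with the model described in the first key point: each observation's mean is a convex combination of K component centres, perturbed by sub-Gaussian noise. The Dirichlet draw for the membership vectors and the Gaussian noise are illustrative assumptions for this sketch, not the paper's exact specification.

```python
import numpy as np

def generate_mixed_membership_data(n=500, d=20, K=3, alpha=0.3,
                                   noise_scale=0.5, seed=0):
    """Draw n observations whose mean is a convex combination of K centres."""
    rng = np.random.default_rng(seed)
    # K component centres in R^d (rows), spread out so they are well separated.
    centres = rng.normal(scale=5.0, size=(K, d))
    # Per-observation membership vectors on the probability simplex
    # (Dirichlet draws are an illustrative choice, not the paper's model).
    memberships = rng.dirichlet(alpha * np.ones(K), size=n)    # n x K
    # Gaussian noise is one convenient instance of sub-Gaussian noise.
    noise = rng.normal(scale=noise_scale, size=(n, d))
    return memberships @ centres + noise, memberships, centres

X, Pi_true, M_true = generate_mixed_membership_data()
print(X.shape, Pi_true.shape)   # (500, 20) (500, 3)
```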

Abstract

The Gaussian mixture model is widely used in unsupervised learning, owing to its simplicity and interpretability. However, a fundamental limitation of the classical Gaussian mixture model is that it forces each observation to belong to exactly one component. In many practical applications, such as genetics, social network analysis, and text mining, an observation may naturally exhibit partial membership in several latent components rather than belonging to just one. To overcome this limitation, we propose the mixed membership sub-Gaussian model, which extends the classical Gaussian mixture framework by allowing each observation to belong to multiple components. This model inherits the interpretability of the classical Gaussian mixture model while offering greater flexibility for capturing complex overlapping structures. We develop an efficient spectral algorithm to estimate the mixed membership of each individual observation, and under mild separation conditions on the component centres, we prove that the estimation error of the per-individual membership vector can be made arbitrarily small with high probability. To our knowledge, this is the first work to provide a computationally efficient estimator with such a vanishing-error guarantee for a mixed-membership extension of the Gaussian mixture model. Extensive experimental studies demonstrate that our method outperforms existing approaches that ignore mixed memberships.
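
The abstract describes a spectral algorithm for recovering per-observation membership vectors but does not spell out its steps. The sketch below illustrates one standard spectral pipeline of that flavour: a rank-K SVD embedding, vertex hunting via the successive projection algorithm, and projection of the resulting weights onto the probability simplex. It is a hedged illustration under those assumptions, not the authors' estimator; `successive_projection`, `project_to_simplex`, and `estimate_memberships` are hypothetical helper names used only for this sketch.

```python
import numpy as np

def successive_projection(R, K):
    """Pick K row indices of R whose rows roughly span its convex hull."""
    residual = R.astype(float).copy()
    vertices = []
    for _ in range(K):
        j = int(np.argmax(np.linalg.norm(residual, axis=1)))
        vertices.append(j)
        u = residual[j] / np.linalg.norm(residual[j])
        # Remove the chosen direction from every row before the next pick.
        residual = residual - np.outer(residual @ u, u)
    return vertices

def project_to_simplex(w):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(w)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u - css / np.arange(1, len(w) + 1) > 0)[0][-1]
    return np.maximum(w - css[rho] / (rho + 1.0), 0.0)

def estimate_memberships(X, K):
    # Step 1: rank-K spectral embedding of the observations.
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    R = U[:, :K] * S[:K]                       # n x K embedding
    # Step 2: vertex hunting -- rows closest to the K simplex vertices.
    V = R[successive_projection(R, K)]         # K x K vertex matrix
    # Step 3: weights of each row w.r.t. the vertices (r_i ~ w_i V),
    # then projection onto the simplex to obtain valid memberships.
    W = R @ np.linalg.pinv(V)
    return np.apply_along_axis(project_to_simplex, 1, W)

# Usage on synthetic data of the kind generated in the earlier sketch.
rng = np.random.default_rng(0)
centres = rng.normal(scale=5.0, size=(3, 20))
Pi_true = rng.dirichlet(0.3 * np.ones(3), size=500)
X = Pi_true @ centres + rng.normal(scale=0.5, size=(500, 20))
Pi_hat = estimate_memberships(X, K=3)
print(Pi_hat.shape, Pi_hat.sum(axis=1)[:3])    # (500, 3), rows sum to 1
```

Vertex-hunting steps of this kind implicitly rely on some observations being nearly pure members of each component, and they recover memberships only up to a relabelling of the components; the paper's separation conditions and guarantees may be stated differently.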