Identifiability of Potentially Degenerate Gaussian Mixture Models With Piecewise Affine Mixing

arXiv cs.AI / 4/16/2026

Key Points

  • The paper studies causal representation learning where latent variables are drawn from a potentially degenerate Gaussian mixture distribution but are only observed after passing through a piecewise affine mixing transformation.
  • It develops progressively stronger theoretical identifiability guarantees, addressing cases where densities become ill-defined due to mixture degeneracy.
  • Identifiability up to permutation and scaling is achieved by leveraging sparsity regularization applied to the learned representation.
  • Using the theory, the authors introduce a two-stage estimation method that enforces both sparsity and Gaussianity to recover latent variables.
  • Experiments on synthetic and image datasets suggest the proposed approach can effectively recover ground-truth latent factors despite the challenging degenerate mixture setting.
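To make the setting in the key points concrete, the following is a minimal NumPy sketch of the data-generating process: latents drawn from a Gaussian mixture whose component covariances may be singular, then passed through a piecewise affine map. It is an illustration only, not the authors' code. The function names, the dimensions, the low-rank factor parameterization of the covariances, and the choice of a leaky-ReLU network as the piecewise affine mixing are all assumptions made for this example.

```python
import numpy as np

def sample_degenerate_gmm(n, means, factors, weights, rng):
    """Draw n latent vectors from a Gaussian mixture.

    Component k has covariance factors[k] @ factors[k].T, which is singular
    (degenerate) whenever factors[k] has fewer columns than rows, so that
    component has no density with respect to Lebesgue measure.
    """
    ks = rng.choice(len(weights), size=n, p=weights)
    z = np.empty((n, means.shape[1]))
    for k in range(len(weights)):
        idx = np.flatnonzero(ks == k)
        eps = rng.standard_normal((idx.size, factors[k].shape[1]))
        z[idx] = means[k] + eps @ factors[k].T
    return z, ks

def piecewise_affine_mixing(z, W1, b1, W2, b2, slope=0.2):
    """Leaky-ReLU network: affine on each region with a fixed activation pattern."""
    h = z @ W1.T + b1
    h = np.where(h > 0, h, slope * h)
    return h @ W2.T + b2

# Hypothetical sizes: 3 latent dimensions, 5 observed dimensions, 2 components.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0, 0.0], [3.0, -1.0, 2.0]])
factors = [rng.standard_normal((3, 1)),   # rank-1 covariance (degenerate)
           rng.standard_normal((3, 2))]   # rank-2 covariance (degenerate)
weights = [0.5, 0.5]

W1, b1 = rng.standard_normal((5, 3)), rng.standard_normal(5)
W2, b2 = rng.standard_normal((5, 5)), rng.standard_normal(5)

z, ks = sample_degenerate_gmm(1000, means, factors, weights, rng)
x = piecewise_affine_mixing(z, W1, b1, W2, b2)
print(z.shape, x.shape)
```

Samples from the first component lie on a line through its mean and samples from the second lie on a plane, which is why the usual density-based arguments do not apply directly in this setting.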

Abstract

Causal representation learning (CRL) aims to identify the underlying latent variables from high-dimensional observations, even when the variables depend on one another. We study this problem for latent variables that follow a potentially degenerate Gaussian mixture distribution and are observed only through a piecewise affine mixing function. We provide a series of progressively stronger identifiability results for this challenging setting, in which probability density functions are ill-defined because of the potential degeneracy. To obtain identifiability up to permutation and scaling, we leverage sparsity regularization on the learned representation. Based on our theoretical results, we propose a two-stage method that estimates the latent variables by enforcing sparsity and Gaussianity in the learned representations. Experiments on synthetic and image data highlight our method's effectiveness in recovering the ground-truth latent variables.
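
For readers unfamiliar with the phrase, "identifiability up to permutation and scaling" is commonly formalized as below. The notation is an assumption introduced here for illustration ($z$ for the true latents, $\hat{z}$ for the learned representation); the paper's exact statement may differ, for example by also allowing a constant offset.

```latex
% Learned latents match the true latents up to reordering and per-coordinate rescaling.
\hat{z} = D P z,
\qquad P \in \{0,1\}^{d \times d} \text{ a permutation matrix},
\qquad D = \operatorname{diag}(d_1, \dots, d_d), \; d_i \neq 0 .
```

Under this reading, each learned coordinate is a nonzero multiple of exactly one true latent coordinate, which is strictly stronger than recovery up to an arbitrary invertible affine map.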