Metric-Aware Principal Component Analysis (MAPCA): A Unified Framework for Scale-Invariant Representation Learning

arXiv cs.LG / 4/17/2026


Key Points

  • The paper proposes Metric-Aware Principal Component Analysis (MAPCA), a unified framework for scale-invariant representation learning formulated as a generalized eigenproblem with a metric constraint W^T M W = I.
  • By selecting the metric M, MAPCA controls representation geometry, and its beta-family M(beta)=Sigma^beta continuously interpolates between standard PCA (beta=0) and output whitening (beta=1) while monotonically improving conditioning toward isotropy.
  • Setting M to the diagonal D=diag(Sigma) yields Invariant PCA (IPCA), which the authors position as a special case within the broader MAPCA family.
  • The authors prove that scale invariance holds exactly when the metric transforms under rescaling as M_tilde = C M C, a condition met by IPCA but generally not by intermediate beta values in the beta-family.
  • MAPCA is also used to interpret and unify several self-supervised learning objectives, clarifying that W-MSE corresponds to M=Sigma^{-1} (beta=-1), which lies outside the whitening interpolation range and reverses the spectral direction relative to Barlow Twins.
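The generalized eigenproblem and beta-family above can be sketched numerically. This is a minimal illustration, not the paper's implementation: the data, dimensions, and the helper name `mapca` are assumptions for the example. It solves max Tr(W^T Sigma W) subject to W^T M W = I via `scipy.linalg.eigh` with M(beta) = Sigma^beta, and checks that the generalized eigenvalues come out as lambda_i^(1-beta), consistent with the stated condition number kappa(beta) = (lambda_1/lambda_p)^(1-beta).

```python
import numpy as np
from scipy.linalg import eigh, fractional_matrix_power

# Synthetic data with strongly anisotropic covariance (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5)) @ np.diag([5.0, 3.0, 1.0, 0.5, 0.1])
Sigma = np.cov(X, rowvar=False)

def mapca(Sigma, beta, k):
    """Top-k MAPCA directions for the metric M(beta) = Sigma^beta.

    beta=0 recovers standard PCA; beta=1 gives output whitening.
    Solves the generalized eigenproblem Sigma w = lambda * M w;
    eigh(a, b) normalizes eigenvectors so that W^T M W = I.
    """
    M = fractional_matrix_power(Sigma, beta).real
    vals, vecs = eigh(Sigma, M)
    order = np.argsort(vals)[::-1]  # descending eigenvalues
    return vals[order][:k], vecs[:, order][:, :k]

beta = 0.5
vals, W = mapca(Sigma, beta, k=3)

# The metric constraint W^T M W = I holds by construction.
M = fractional_matrix_power(Sigma, beta).real
print(np.allclose(W.T @ M @ W, np.eye(3)))

# Since Sigma and Sigma^beta commute, the generalized eigenvalues are
# lambda_i^(1-beta), so the spectrum flattens toward isotropy as beta -> 1.
lam = np.linalg.eigvalsh(Sigma)[::-1]
print(np.allclose(vals, lam[:3] ** (1 - beta)))
```

Sweeping `beta` from 0 to 1 in this sketch reproduces the monotone decrease of the condition number (lambda_1/lambda_p)^(1-beta) toward 1.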

Abstract

We introduce Metric-Aware Principal Component Analysis (MAPCA), a unified framework for scale-invariant representation learning based on the generalised eigenproblem max Tr(W^T Sigma W) subject to W^T M W = I, where M is a symmetric positive definite metric matrix. The choice of M determines the representation geometry. The canonical beta-family M(beta) = Sigma^beta, beta in [0,1], provides continuous spectral bias control between standard PCA (beta=0) and output whitening (beta=1), with condition number kappa(beta) = (lambda_1/lambda_p)^(1-beta) decreasing monotonically to isotropy. The diagonal metric M = D = diag(Sigma) recovers Invariant PCA (IPCA), a method rooted in Frisch (1928) diagonal regression, as a distinct member of the broader framework. We prove that scale invariance holds if and only if the metric transforms as M_tilde = CMC under rescaling C, a condition satisfied exactly by IPCA but not by the general beta-family at intermediate values. Beyond its classical interpretation, MAPCA provides a geometric language that unifies several self-supervised learning objectives. Barlow Twins and ZCA whitening correspond to beta=1 (output whitening); VICReg's variance term corresponds to the diagonal metric. A key finding is that W-MSE, despite being described as a whitening-based method, corresponds to M = Sigma^{-1} (beta = -1), outside the spectral compression range entirely and in the opposite spectral direction to Barlow Twins. This distinction between input and output whitening is invisible at the level of loss functions and becomes precise only within the MAPCA framework.
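The scale-invariance criterion M_tilde = C M C can be checked directly. The sketch below (matrices and sizes are illustrative assumptions) verifies that the diagonal IPCA metric D = diag(Sigma) satisfies the condition under a per-feature rescaling C, while an intermediate member of the beta-family, M = Sigma^(1/2), generally does not, matching the paper's claim.

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
Sigma = A @ A.T + 0.1 * np.eye(4)       # SPD covariance (illustrative)
C = np.diag([2.0, 0.5, 3.0, 1.0])       # per-feature rescaling
Sigma_t = C @ Sigma @ C                 # covariance of the rescaled data

# IPCA metric D = diag(Sigma): diagonal metrics transform covariantly,
# since diag(C Sigma C) = C diag(Sigma) C when C is diagonal.
D = np.diag(np.diag(Sigma))
D_t = np.diag(np.diag(Sigma_t))
print(np.allclose(D_t, C @ D @ C))      # True: M_tilde = C M C

# Intermediate beta: M(beta) = Sigma^beta fails the condition in general,
# because (C Sigma C)^beta != C Sigma^beta C unless C and Sigma commute.
beta = 0.5
M = fractional_matrix_power(Sigma, beta).real
M_t = fractional_matrix_power(Sigma_t, beta).real
print(np.allclose(M_t, C @ M @ C))      # False for generic Sigma, C
```

This is exactly why IPCA is scale invariant while intermediate points of the whitening interpolation are not, even though both sit inside the same MAPCA family.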