Mild Over-Parameterization Benefits Asymmetric Tensor PCA

arXiv cs.LG / 4/14/2026


Key Points

  • This work studies Asymmetric Tensor PCA (ATPCA), focusing on the trade-offs between sample complexity, computation, and memory under a limited state-memory budget.
  • The authors show that existing ATPCA algorithms typically need at least d^{⌈k̄/2⌉} state memory to recover the signal, where d is the vector dimension and k̄ the tensor order, motivating a memory-efficient approach.
  • They propose a matrix-parameterized method with d^2 state memory, using a novel three-phase alternating-update algorithm along with (stochastic) gradient descent-based learning.
  • Mild over-parameterization is shown to improve sample efficiency, achieving near-optimal d^{k̄−2} sample complexity, and to enhance adaptivity to problem structure: the required sample size decreases as consecutive signal vectors become more aligned, reaching d^{k̄/2} in the symmetric limit, which matches the best known polynomial-time complexity.
  • The paper claims this is the first tractable algorithm for ATPCA whose memory cost does not scale with d^{k̄} (i.e., d^{k̄}-independent memory); the proposed method needs only d² state memory.
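The summary does not spell out the three-phase algorithm itself, but the core idea of matrix parameterization can be illustrated on a toy order-4 instance: rather than maintaining state that scales like the tensor, the algorithm's entire state is a single d × d matrix updated by gradient steps. Below is a minimal sketch, not the authors' method; the planted model, dimensions, step size, and noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, lam, noise_std = 20, 8.0, 0.02  # illustrative sizes, not from the paper

# Planted order-4 asymmetric signal: T = lam * u1 (x) u2 (x) u3 (x) u4 + noise.
u = [rng.standard_normal(d) for _ in range(4)]
u = [v / np.linalg.norm(v) for v in u]
T = lam * np.einsum("i,j,k,l->ijkl", *u)
T += noise_std * rng.standard_normal((d, d, d, d))
# (The full tensor is materialized here only for the demo; in a streaming
# setting the algorithm would process samples and keep just the d x d state.)

# Matrix parameterization: the entire algorithm state is one d x d matrix M
# (d^2 memory), ascending the objective f(M) = <T, M (x) M>.
M = rng.standard_normal((d, d))
M /= np.linalg.norm(M)
eta = 0.2
for _ in range(60):
    # Gradient of f(M) = sum_{ijkl} T[i,j,k,l] M[i,j] M[k,l]
    grad = np.einsum("ijkl,kl->ij", T, M) + np.einsum("ijkl,ij->kl", T, M)
    M += eta * grad
    M /= np.linalg.norm(M)

# In this toy model M converges (up to sign) toward (u1 u2^T + u3 u4^T)/sqrt(2),
# so its overlap with the rank-1 factor u1 u2^T approaches ~1/sqrt(2).
overlap = abs(u[0] @ M @ u[1])
print(f"overlap with u1 u2^T: {overlap:.2f}")
```

The point of the sketch is the memory accounting: the update touches the data only through contractions against M, so the state carried between iterations is d² numbers, never an object of size d^{k̄}.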

Abstract

Asymmetric Tensor PCA (ATPCA) is a prototypical model for studying the trade-offs between sample complexity, computation, and memory. Existing algorithms for this problem typically require at least d^{\left\lceil\overline{k}/2\right\rceil} state memory cost to recover the signal, where d is the vector dimension and \overline{k} is the tensor order. We focus on the setting where \overline{k} \geq 4 is even and consider (stochastic) gradient descent-based algorithms under a limited memory budget, which permits only mild over-parameterization of the model. We propose a matrix-parameterized method (in d^{2} state memory cost) using a novel three-phase alternating-update algorithm to address the problem and demonstrate how mild over-parameterization facilitates learning in two key aspects: (i) it improves sample efficiency, allowing our method to achieve \emph{near-optimal} d^{\overline{k}-2} sample complexity in our limited memory setting; and (ii) it enhances adaptivity to problem structure, a previously unrecognized phenomenon, where the required sample size naturally decreases as consecutive vectors become more aligned, and in the symmetric limit attains d^{\overline{k}/2}, matching the \emph{best} known polynomial-time complexity. To our knowledge, this is the \emph{first} tractable algorithm for ATPCA with d^{\overline{k}}-independent memory costs.
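The complexity claims in the abstract can be collected into one display (all quantities as stated above; no constants or log factors are specified in this summary):

```latex
\begin{aligned}
\text{existing algorithms:}\quad & \text{state memory } \geq d^{\lceil \overline{k}/2 \rceil} \\
\text{this work:}\quad & \text{state memory } d^{2}, \qquad \text{samples } d^{\overline{k}-2} \ \text{(near-optimal under the memory budget)} \\
\text{symmetric limit:}\quad & \text{samples } d^{\overline{k}/2} \ \text{(matching the best known polynomial-time complexity)}
\end{aligned}
```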