Mild Over-Parameterization Benefits Asymmetric Tensor PCA

arXiv cs.LG / 4/14/2026


Key Points

  • This work studies Asymmetric Tensor PCA (ATPCA), focusing on the trade-offs between sample complexity, computation, and memory under a limited state-memory budget.
  • The authors show that existing ATPCA algorithms typically need at least d^{⌈k̄/2⌉} state memory to recover the signal, where d is the vector dimension and k̄ the tensor order, motivating a memory-efficient approach.
  • They propose a matrix-parameterized method with d^2 state memory, using a novel three-phase alternating-update algorithm along with (stochastic) gradient descent-based learning.
  • Mild over-parameterization is shown to improve sample efficiency, achieving near-optimal d^{k̄−2} sample complexity, and to enhance adaptivity to problem structure: the required sample size decreases as consecutive signal vectors become more aligned, reaching d^{k̄/2} in the symmetric limit, which matches the best known polynomial-time complexity.
  • The paper claims this is the first tractable algorithm for ATPCA whose memory cost does not scale with d^{k̄} (i.e., d^{k̄}-independent memory); the proposed method needs only d² state memory.
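The summary does not spell out the three-phase algorithm itself, but the core idea of matrix parameterization can be illustrated on a toy order-4 instance: rather than maintaining state that scales like the tensor, the algorithm's entire state is a single d × d matrix updated by gradient steps. Below is a minimal sketch, not the authors' method; the planted model, dimensions, step size, and noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, lam, noise_std = 20, 8.0, 0.02  # illustrative sizes, not from the paper

# Planted order-4 asymmetric signal: T = lam * u1 (x) u2 (x) u3 (x) u4 + noise.
u = [rng.standard_normal(d) for _ in range(4)]
u = [v / np.linalg.norm(v) for v in u]
T = lam * np.einsum("i,j,k,l->ijkl", *u)
T += noise_std * rng.standard_normal((d, d, d, d))
# (The full tensor is materialized here only for the demo; in a streaming
# setting the algorithm would process samples and keep just the d x d state.)

# Matrix parameterization: the entire algorithm state is one d x d matrix M
# (d^2 memory), ascending the objective f(M) = <T, M (x) M>.
M = rng.standard_normal((d, d))
M /= np.linalg.norm(M)
eta = 0.2
for _ in range(60):
    # Gradient of f(M) = sum_{ijkl} T[i,j,k,l] M[i,j] M[k,l]
    grad = np.einsum("ijkl,kl->ij", T, M) + np.einsum("ijkl,ij->kl", T, M)
    M += eta * grad
    M /= np.linalg.norm(M)

# In this toy model M converges (up to sign) toward (u1 u2^T + u3 u4^T)/sqrt(2),
# so its overlap with the rank-1 factor u1 u2^T approaches ~1/sqrt(2).
overlap = abs(u[0] @ M @ u[1])
print(f"overlap with u1 u2^T: {overlap:.2f}")
```

The point of the sketch is the memory accounting: the update touches the data only through contractions against M, so the state carried between iterations is d² numbers, never an object of size d^{k̄}.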

Abstract

Asymmetric Tensor PCA (ATPCA) is a prototypical model for studying the trade-offs between sample complexity, computation, and memory. Existing algorithms for this problem typically require at least d^{\left\lceil\overline{k}/2\right\rceil} state memory cost to recover the signal, where d is the vector dimension and \overline{k} is the tensor order. We focus on the setting where \overline{k} \geq 4 is even and consider (stochastic) gradient descent-based algorithms under a limited memory budget, which permits only mild over-parameterization of the model. We propose a matrix-parameterized method (in d^{2} state memory cost) using a novel three-phase alternating-update algorithm to address the problem and demonstrate how mild over-parameterization facilitates learning in two key aspects: (i) it improves sample efficiency, allowing our method to achieve \emph{near-optimal} d^{\overline{k}-2} sample complexity in our limited memory setting; and (ii) it enhances adaptivity to problem structure, a previously unrecognized phenomenon, where the required sample size naturally decreases as consecutive vectors become more aligned, and in the symmetric limit attains d^{\overline{k}/2}, matching the \emph{best} known polynomial-time complexity. To our knowledge, this is the \emph{first} tractable algorithm for ATPCA with d^{\overline{k}}-independent memory costs.
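The complexity claims in the abstract can be collected into one display (all quantities as stated above; no constants or log factors are specified in this summary):

```latex
\begin{aligned}
\text{existing algorithms:}\quad & \text{state memory } \geq d^{\lceil \overline{k}/2 \rceil} \\
\text{this work:}\quad & \text{state memory } d^{2}, \qquad \text{samples } d^{\overline{k}-2} \ \text{(near-optimal under the memory budget)} \\
\text{symmetric limit:}\quad & \text{samples } d^{\overline{k}/2} \ \text{(matching the best known polynomial-time complexity)}
\end{aligned}
```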