Learning Curves and Benign Overfitting of Spectral Algorithms in Large Dimensions

arXiv stat.ML / 4/28/2026


Key Points

  • The paper investigates learning curves and “benign overfitting” for spectral algorithms in large dimensions, in the polynomial regime where the sample size scales with the dimension as n ≍ d^γ for some γ > 0.
  • For inner-product kernels on the sphere, it derives a sharp asymptotic description of the excess risk along the entire regularization path under source conditions s ≥ 0, where s quantifies the relative smoothness of the regression function.
  • The study finds that the learning curve is not simply U-shaped: it splits into three regimes (over-regularized, under-regularized, and interpolation), each with distinct behavior; a toy regularization sweep illustrating this path appears after this list.
  • Benign overfitting is shown to occur consistently across both the under-regularized and interpolation regimes whenever the smoothness parameter s is positive but no larger than a critical threshold, and in the sufficiently regularized regime the kernel learning curve is recovered by an associated sequence model.
  • The analysis is extended to large-dimensional kernel ridge regression (KRR) on general domains in R^d for kernels whose low-degree eigenspaces satisfy spectral-scaling and hypercontractivity conditions.
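
To make the regularization path concrete, here is a minimal simulation sketch: kernel ridge regression on the sphere with an inner-product kernel, sweeping the ridge parameter from heavily over-regularized down to interpolation and estimating the excess risk at each step. Every concrete choice below (the exponential kernel, the single-index target, the sizes d = 50 and n = 500) is a hypothetical stand-in rather than the paper's setup, and a single finite-sample run need not reproduce the asymptotic three-regime behavior.

    # KRR on the sphere S^{d-1} with an inner-product kernel; sweep the
    # regularization path and estimate the excess risk at each lambda.
    # All concrete choices here are hypothetical, for illustration only.
    import numpy as np

    rng = np.random.default_rng(0)

    def sample_sphere(n, d):
        """Draw n points uniformly on the unit sphere S^{d-1}."""
        x = rng.standard_normal((n, d))
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    def kernel(X, Y):
        """Inner-product kernel k(x, y) = exp(<x, y>) (one admissible choice)."""
        return np.exp(X @ Y.T)

    d, n, n_test, sigma = 50, 500, 2000, 0.5   # dimension, samples, noise level
    w = sample_sphere(1, d)[0]                 # hidden direction of the target

    def f_star(X):
        """Illustrative smooth target f*(x) = <w, x>^2."""
        return (X @ w) ** 2

    X, X_test = sample_sphere(n, d), sample_sphere(n_test, d)
    y = f_star(X) + sigma * rng.standard_normal(n)
    K, K_test = kernel(X, X), kernel(X_test, X)

    # lambda = 0 is the interpolation end of the path; exp(<x, y>) is strictly
    # positive definite, so K is invertible for distinct sample points.
    for lam in [1e2, 1e0, 1e-2, 1e-4, 1e-8, 0.0]:
        alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
        risk = np.mean((K_test @ alpha - f_star(X_test)) ** 2)
        print(f"lambda = {lam:8.0e}   estimated excess risk = {risk:.4f}")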

Abstract

Existing large-dimensional theory for spectral algorithms resolves either the optimally tuned point or the interpolation limit, but leaves the under-regularized regime unexplored. We study the learning curve and benign overfitting of spectral algorithms in the large-dimensional setting where the sample size and dimension are of comparable order, i.e., $n \asymp d^{\gamma}$ for some $\gamma > 0$. We first consider inner-product kernels on the sphere $\mathbb{S}^{d-1}$ and establish a sharp asymptotic characterization of the excess risk across the full regularization path under various source conditions $s \geq 0$, where $s$ measures the relative smoothness of the regression function. Our results reveal that the learning curve is not simply U-shaped but instead consists of three distinct regimes: over-regularized, under-regularized, and interpolation. This characterization allows us to fully capture the benign overfitting phenomenon, demonstrating that benign overfitting arises consistently across both the under-regularized and interpolation regimes whenever $s$ is positive but no larger than a critical threshold. We further show that, in the sufficiently regularized regime, the kernel learning curve is recovered by an associated sequence model. Finally, we extend the learning-curve analysis to large-dimensional KRR for a class of kernels on general domains in $\mathbb{R}^d$ whose low-degree eigenspaces satisfy spectral-scaling and hyper-contractivity conditions.
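
For orientation on the "associated sequence model" remark, the standard sequence-model heuristic for ridge-type estimators in kernel learning theory reads as follows; this is textbook notation (kernel eigenvalues $\lambda_i$, target coefficients $\theta_i$, noise level $\sigma^2$, sample size $n$), offered as background rather than as the paper's exact statement:
\[
  \mathcal{R}(\lambda) \;\approx\; \sum_{i}\Bigl(\frac{\lambda}{\lambda_i+\lambda}\Bigr)^{2}\theta_i^{2} \;+\; \frac{\sigma^{2}}{n}\sum_{i}\Bigl(\frac{\lambda_i}{\lambda_i+\lambda}\Bigr)^{2}.
\]
The first sum is the bias, which dominates when $\lambda$ is too large (over-regularized); the second is the variance, which dominates when $\lambda$ is too small (under-regularized); a source condition with parameter $s$ constrains how fast $\theta_i^2$ decays relative to $\lambda_i$.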
