High-Dimensional Private Linear Regression with Optimal Rates

arXiv stat.ML · April 28, 2026

💬 Opinion · Models & Research

Key Points

  • The paper studies differentially private (DP) linear regression in the high-dimensional random-data regime where the dimension-to-sample ratio satisfies d/n → γ.
  • It analyzes a class of one-pass DP gradient descent (DP-GD) algorithms under (ρ²/2)-zero-concentrated DP (zCDP), deriving a deterministic equivalent of the algorithm's trajectory via a system of ordinary differential equations (ODEs).
  • Using this framework, the authors quantify how setting the gradient clipping constant smaller than the typical per-sample gradient norm (an approach that often improves empirical performance) affects convergence and risk.
  • For well-conditioned data, they show DP-GD can achieve non-asymptotic risk O(γ + γ²/ρ²) with appropriately chosen clipping constant and learning rate, and they prove this rate is minimax optimal.
  • For ill-conditioned data with power-law covariance spectra, they derive a different (power-like) scaling of the risk in γ and show how the scaling exponent changes with the privacy parameter ρ, also highlighting benefits from aggressive clipping and decaying learning-rate schedules.
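To make the algorithm class concrete, here is a minimal sketch of one-pass DP-GD for linear regression with per-sample gradient clipping and Gaussian noise. The function name, the noise calibration `sigma = 2 * clip_c / rho`, and the hyperparameters are illustrative assumptions for a generic (ρ²/2)-zCDP one-pass scheme, not the paper's exact algorithm.

```python
import numpy as np

def one_pass_dp_gd(X, y, clip_c, lr, rho):
    """Sketch of one-pass DP-GD for least squares: each sample is visited
    exactly once, its gradient is clipped to norm clip_c, and Gaussian
    noise is added. Calibration is illustrative: under swap adjacency the
    clipped per-sample gradient has L2-sensitivity 2*clip_c, and since each
    sample influences only one step, a single Gaussian-mechanism release
    with sigma = 2*clip_c/rho suffices for a (rho^2/2)-zCDP-style budget."""
    n, d = X.shape
    theta = np.zeros(d)
    sigma = 2 * clip_c / rho
    for i in range(n):
        g = (X[i] @ theta - y[i]) * X[i]                        # per-sample gradient
        g = g * min(1.0, clip_c / (np.linalg.norm(g) + 1e-12))  # clip to norm clip_c
        theta = theta - lr * (g + sigma * np.random.randn(d))   # noisy step
    return theta
```

Note that choosing `clip_c` below the typical per-sample gradient norm biases the clipped gradients but shrinks the injected noise proportionally, which is the trade-off the paper's ODE analysis quantifies.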

Abstract

Differentially private (DP) linear regression has received significant attention in the recent theoretical literature, with several approaches proposed to improve error rates. Our work considers the popular high-dimensional regime with random data, where the number of training samples n and the input dimension d grow at a proportional rate d / n \to \gamma, and it studies a family of one-pass DP gradient descent (DP-GD) algorithms satisfying (\rho^2 / 2)-zero-concentrated DP. In this setting, we establish a deterministic equivalent for the DP-GD trajectory in terms of a system of ordinary differential equations. This allows us to analyze the effect of gradient clipping constants that are smaller than the typical norm of the per-sample gradients, a setup shown to improve performance in practice. For well-conditioned data, we show that DP-GD, upon properly choosing the clipping constant and learning rate, achieves a non-asymptotic risk of O(\gamma + \gamma^2 / \rho^2), and we establish that this rate is minimax optimal. Then, we consider the ill-conditioned case where the data covariance spectrum follows a power-law distribution, and we show that the risk displays a power-like scaling law in \gamma, highlighting the change in the exponent as a function of the privacy parameter \rho. Overall, our analysis demonstrates the benefits of practical algorithmic design choices, including aggressive gradient clipping and decaying learning rate schedules.
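For readers unfamiliar with zCDP, the following standard Gaussian-mechanism calibration (a textbook fact, not taken from the paper) explains why a smaller clipping constant permits proportionally less noise at a fixed privacy budget:

```latex
% Gaussian mechanism under zCDP (Bun-Steinke):
% releasing f(x) + \mathcal{N}(0, \sigma^2 I_d), where f has
% \ell_2-sensitivity \Delta, satisfies \frac{\Delta^2}{2\sigma^2}\text{-zCDP}.
% Matching the budget \frac{\rho^2}{2}\text{-zCDP} requires
%   \frac{\Delta^2}{2\sigma^2} = \frac{\rho^2}{2}
%   \quad\Longrightarrow\quad \sigma = \frac{\Delta}{\rho}.
% With per-sample gradients clipped to norm c, the sensitivity \Delta
% scales linearly in c, so the noise scale \sigma does as well: halving
% the clipping constant halves the injected noise at the same \rho.
```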