Sharp description of local minima in the loss landscape of high-dimensional two-layer ReLU neural networks

arXiv stat.ML / 4/13/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper analyzes the population loss landscape of two-layer ReLU networks in a teacher-student (realizable) setting with Gaussian inputs, focusing specifically on the structure of local minima.
  • It shows that local minima can be exactly represented using a low-dimensional set of summary statistics, enabling a sharper and more interpretable characterization of the landscape.
  • The work links the geometry of local minima to the dynamics of one-pass SGD by showing that minima correspond to attractive fixed points in the summary-statistics space.
  • It finds a hierarchical structure: minima are typically isolated in the well-specified regime but become connected by flat directions as width increases, making global minima increasingly accessible to the dynamics and reducing convergence to spurious solutions.
  • The authors argue that standard simplifying assumptions can miss key features of the loss landscape even for minimal neural network models.
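The setting described above can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's construction: it builds a teacher and a (wider) student network of the form \sum_{k=1}^K \mathrm{ReLU}(w_k^\top x) and estimates the population loss by Monte Carlo over Gaussian inputs. All dimensions, widths, and the 1/\sqrt{d} initialisation scale are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 50         # input dimension (hypothetical choice)
K_teacher = 2  # teacher width
K_student = 4  # student width; K_student > K_teacher is the overparameterised case

# Teacher and student weights for f(x) = sum_k ReLU(w_k^T x)
W_star = rng.normal(size=(K_teacher, d)) / np.sqrt(d)
W = rng.normal(size=(K_student, d)) / np.sqrt(d)

def network(W, X):
    """Two-layer ReLU network sum_k ReLU(w_k^T x), evaluated on each row x of X."""
    return np.maximum(W @ X.T, 0.0).sum(axis=0)

def population_loss(W, W_star, n=100_000):
    """Monte Carlo estimate of (1/2) E_x[(f_W(x) - f_{W*}(x))^2], x ~ N(0, I_d).

    In the realisable setting the global minimum is 0, attained when the
    student reproduces the teacher's function.
    """
    X = rng.normal(size=(n, d))
    return 0.5 * np.mean((network(W, X) - network(W_star, X)) ** 2)

loss_random = population_loss(W, W_star)        # generic student: strictly positive
loss_teacher = population_loss(W_star, W_star)  # teacher against itself: exactly 0
```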

Abstract

We study the population loss landscape of two-layer ReLU networks of the form \sum_{k=1}^K \mathrm{ReLU}(w_k^\top x) in a realisable teacher-student setting with Gaussian covariates. We show that local minima admit an exact low-dimensional representation in terms of summary statistics, yielding a sharp and interpretable characterisation of the landscape. We further establish a direct link with one-pass SGD: local minima correspond to attractive fixed points of the dynamics in summary statistics space. This perspective reveals a hierarchical structure of minima: they are typically isolated in the well-specified regime, but become connected by flat directions as network width increases. In this overparameterised regime, global minima become increasingly accessible, attracting the dynamics and reducing convergence to spurious solutions. Overall, our results reveal intrinsic limitations of common simplifying assumptions, which may miss essential features of the loss landscape even in minimal neural network models.