On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains

arXiv stat.ML / 4/8/2026


Key Points

  • The paper presents a strategy for determining eigenvalue decay rates (EDR) for a broad class of kernel functions defined on general domains rather than only on spheres; a small numerical sketch of the EDR notion appears after this list.
  • It covers kernels including neural tangent kernels (NTK) for wide neural networks with varying depths and activation functions, extending prior theoretical results beyond \mathbb S^{d}.
  • The authors show that training dynamics of wide neural networks can be uniformly approximated by NTK regression on general domains.
  • They further analyze minimax optimality under a source condition on the target function (f \in [\mathcal H_{\mathrm{NTK}}]^{s}, an interpolation space associated with the NTK's RKHS) and argue that overfitted neural networks may generalize poorly.
  • The work suggests the proposed EDR approach could be of broader independent interest for theoretical kernel/learning analysis.
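
To make the EDR notion concrete, below is a minimal numerical sketch of how one might eyeball an eigenvalue decay rate empirically. It builds the Gram matrix of the standard closed-form two-layer bias-free ReLU NTK (up to normalization) on uniform samples from [0, 1]^d, a general domain rather than the sphere, and fits the slope of the eigenvalue log-log plot. The domain, sample size, and index range are illustrative choices, not the paper's setup, and the paper's results are analytic rather than empirical; known results for the ReLU NTK suggest an exponent of (d+1)/d.

```python
import numpy as np

def relu_ntk(X, Y):
    """Closed-form NTK of a bias-free two-layer ReLU network (up to normalization)."""
    nx = np.linalg.norm(X, axis=1, keepdims=True)    # (n, 1) row norms
    ny = np.linalg.norm(Y, axis=1, keepdims=True)    # (m, 1) row norms
    u = np.clip((X @ Y.T) / (nx * ny.T), -1.0, 1.0)  # cosine of the angle
    theta = np.arccos(u)
    k0 = (np.pi - theta) / np.pi                        # arc-cosine kernel, order 0
    k1 = (np.sin(theta) + (np.pi - theta) * u) / np.pi  # arc-cosine kernel, order 1
    return (X @ Y.T) * k0 + (nx * ny.T) * k1

d, n = 3, 2000
rng = np.random.default_rng(0)
X = rng.uniform(size=(n, d))                        # general domain [0, 1]^d, not a sphere
lam = np.linalg.eigvalsh(relu_ntk(X, X) / n)[::-1]  # eigenvalues, descending

# Fit log(lambda_i) ~ -beta * log(i) over a mid-range of indices,
# away from the first few eigenvalues and the numerically unreliable tail.
idx = np.arange(20, 500)
beta = -np.polyfit(np.log(idx), np.log(lam[idx - 1]), 1)[0]
print(f"estimated EDR exponent beta = {beta:.2f}; reference (d+1)/d = {(d+1)/d:.2f}")
```

The mid-range index window is the usual precaution: the leading eigenvalues reflect low-order structure rather than the asymptotic rate, and the tail of an n-point Gram matrix no longer tracks the underlying integral operator.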

Abstract

In this paper, we provide a strategy to determine the eigenvalue decay rate (EDR) of a large class of kernel functions defined on a general domain rather than on \mathbb S^{d}. This class of kernel functions includes, but is not limited to, the neural tangent kernels (NTK) associated with neural networks of different depths and various activation functions. After proving that the dynamics of training wide neural networks are uniformly approximated by those of neural tangent kernel regression on general domains, we can further establish the minimax optimality of the wide neural network, provided that the ground-truth function f \in [\mathcal H_{\mathrm{NTK}}]^{s}, an interpolation space associated with the RKHS \mathcal H_{\mathrm{NTK}} of the NTK. We also show that the overfitted neural network cannot generalize well. We believe that our approach for determining the EDR of kernels may also be of independent interest.
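
For context, the following spells out the standard definitions behind these statements in the abstract's own notation; this is a sketch of the usual kernel-regression setup, and the paper's precise assumptions may differ. If the kernel admits a Mercer decomposition

K(x, x') = \sum_{i \ge 1} \lambda_i \, e_i(x) \, e_i(x'), \qquad \lambda_1 \ge \lambda_2 \ge \cdots > 0,

then an EDR of \beta means \lambda_i \asymp i^{-\beta}, and the interpolation spaces are

[\mathcal H]^{s} = \Big\{ \sum_{i} a_i \lambda_i^{s/2} e_i : \sum_{i} a_i^{2} < \infty \Big\}, \qquad s > 0,

so that s = 1 recovers \mathcal H itself and smaller s imposes a weaker smoothness assumption. Under f \in [\mathcal H_{\mathrm{NTK}}]^{s} with EDR \beta, the classical minimax rate for the excess risk is n^{-s\beta/(s\beta + 1)}; this is the benchmark against which the wide network's optimality is measured.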