Lipschitz bounds for integral kernels

arXiv stat.ML / 4/6/2026


Key Points

  • The paper studies when feature maps induced by integral (positive definite) kernels are Lipschitz continuous, deriving explicit formulas for the Lipschitz constants under differentiability assumptions.
  • It provides sufficient conditions for Lipschitz continuity, along with a condition showing when the feature map can fail to be Lipschitz continuous, and applies these results to several kernel families.
  • For infinite-width two-layer neural networks with isotropic Gaussian weights, it expresses the kernel’s Lipschitz constant as a supremum of a two-dimensional integral, yielding explicit characterizations for the Gaussian kernel and the ReLU random neural network kernel.
  • For continuous shift-invariant kernels (Gaussian, Laplace, Matérn), the work proves that the feature map is Lipschitz continuous if and only if the weight distribution has a finite second-order moment, and it derives the corresponding Lipschitz constant (a minimal numerical sketch of this criterion appears after this list).
  • The authors include numerical experiments and pose an open question about the asymptotic behavior of Lipschitz-constant convergence in finite-width neural networks.
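
To illustrate the mechanism behind the second-moment criterion in the shift-invariant case, here is a minimal numerical sketch. It is our illustration, not code from the paper; the bandwidth `sigma`, dimension `d`, and sample count are arbitrary choices. It checks that the feature-map distance ratio stays below the crude bound sqrt(E‖w‖²) obtained from 1 − cos t ≤ t²/2.

```python
import numpy as np

# Minimal Monte Carlo sketch (ours, not the paper's code). For a continuous
# shift-invariant kernel, Bochner's theorem gives k(x, y) = E_w[cos(<w, x-y>)],
# so the RKHS feature-map distance satisfies
#     ||phi(x) - phi(y)||^2 = 2 * E_w[1 - cos(<w, x - y>)].
# Since 1 - cos(t) <= t^2 / 2, a finite second moment E[||w||^2] yields the
# (crude) Lipschitz bound ||phi(x) - phi(y)|| <= sqrt(E[||w||^2]) * ||x - y||.

rng = np.random.default_rng(0)
d, n_samples, sigma = 3, 200_000, 1.5   # illustrative choices

# Isotropic Gaussian weights N(0, I/sigma^2) correspond to the Gaussian
# kernel with bandwidth sigma.
w = rng.normal(scale=1.0 / sigma, size=(n_samples, d))
bound = np.sqrt(np.mean(np.sum(w**2, axis=1)))    # approx. sqrt(d)/sigma

u = rng.normal(size=d)
u /= np.linalg.norm(u)                            # fixed unit direction
for r in [2.0, 0.5, 0.1, 0.01]:
    dist_sq = 2.0 * np.mean(1.0 - np.cos(w @ (r * u)))
    print(f"r={r:5.2f}  ratio={np.sqrt(dist_sq) / r:.4f}  bound={bound:.4f}")
# The ratio stays below the bound and tends to 1/sigma as r -> 0, the sharp
# directional constant in this isotropic Gaussian case.
```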

Abstract

Feature maps associated with positive definite kernels play a central role in kernel methods and learning theory, where regularity properties such as Lipschitz continuity are closely related to robustness and stability guarantees. Despite their importance, explicit characterizations of the Lipschitz constant of kernel feature maps are available only in a limited number of cases. In this paper, we study the Lipschitz regularity of feature maps associated with integral kernels under differentiability assumptions. We first provide sufficient conditions ensuring Lipschitz continuity and derive explicit formulas for the corresponding Lipschitz constants. We then identify a condition under which the feature map fails to be Lipschitz continuous and apply these results to several important classes of kernels. For infinite-width two-layer neural networks with isotropic Gaussian weight distributions, we show that the Lipschitz constant of the associated kernel can be expressed as the supremum of a two-dimensional integral, leading to an explicit characterization for the Gaussian kernel and the ReLU random neural network kernel. We also study continuous shift-invariant kernels such as the Gaussian, Laplace, and Matérn kernels, which admit an interpretation as neural networks with a cosine activation function. In this setting, we prove that the feature map is Lipschitz continuous if and only if the weight distribution has a finite second-order moment, and we then derive its Lipschitz constant. Finally, we raise an open question concerning the asymptotic behavior of the convergence of the Lipschitz constant in finite-width neural networks. Numerical experiments are provided to support the conjectured behavior.
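
For a sense of the explicit constants involved, the Gaussian case can be worked out from the standard RKHS identity alone; the short derivation below is ours, not quoted from the paper.

```latex
% Gaussian kernel k(x,y) = \exp(-\|x-y\|^2/(2\sigma^2)) with canonical feature map \varphi:
\[
  \|\varphi(x)-\varphi(y)\|_{\mathcal H}^{2}
    = k(x,x) + k(y,y) - 2\,k(x,y)
    = 2\Bigl(1 - e^{-\|x-y\|^{2}/(2\sigma^{2})}\Bigr)
    \le \frac{\|x-y\|^{2}}{\sigma^{2}},
\]
% using 1 - e^{-t} \le t. Hence \varphi is (1/\sigma)-Lipschitz, and since the
% inequality is tight as \|x-y\| \to 0, the Lipschitz constant equals 1/\sigma.
```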
