Beyond ReLU: How Activations Affect Neural Kernels and Random Wide Networks
arXiv stat.ML / 4/28/2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- The paper studies how activation functions beyond ReLU affect the neural tangent kernel (NTK) and the neural network Gaussian process (NNGP) kernel, focusing on activations whose only non-smoothness occurs at zero (a Monte Carlo sketch of the NNGP kernel recursion appears after this list).
- It characterizes the reproducing kernel Hilbert space (RKHS) associated with these kernels, extending existing NTK/NNGP theory to activations such as SELU, ELU, and LeakyReLU.
- The authors analyze several variants and special cases, including architectures without bias terms, two-layer networks, and polynomial activations.
- Results indicate that for many activations that are not infinitely smooth, the resulting RKHS is the same across network depths and depends mainly on the "degree" of non-smoothness at zero, whereas polynomial activations yield depth-dependent RKHSs.
- The work also derives smoothness properties of NNGP sample paths, characterizing the smoothness of infinitely wide neural networks at initialization.
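To make the objects in these results concrete, here is a minimal sketch (not the paper's code) of the NNGP kernel recursion for a deep fully connected network, estimated by Monte Carlo so that arbitrary activations such as ReLU, LeakyReLU, and ELU can be plugged in. The depth, weight/bias variances, and sample count are illustrative assumptions; ReLU-type activations admit closed-form (arc-cosine) expressions, but Monte Carlo handles the general case.

```python
# Sketch: Monte Carlo estimate of the NNGP kernel K^{(L)}(x1, x2) for an
# infinitely wide MLP with a user-supplied activation. Hyperparameters
# (sigma_w^2, sigma_b^2, depth, n_mc) are illustrative assumptions.
import numpy as np

def nngp_kernel(x1, x2, phi, depth=3, sigma_w2=2.0, sigma_b2=0.1,
                n_mc=200_000, seed=0):
    """Estimate the depth-`depth` NNGP kernel entry K(x1, x2) for activation phi."""
    rng = np.random.default_rng(seed)
    d = x1.shape[0]
    # First-layer (pre-activation) kernel entries for the pair (x1, x2).
    k11 = sigma_w2 * (x1 @ x1) / d + sigma_b2
    k22 = sigma_w2 * (x2 @ x2) / d + sigma_b2
    k12 = sigma_w2 * (x1 @ x2) / d + sigma_b2
    for _ in range(depth - 1):
        # Draw (u, v) ~ N(0, Lambda) with Lambda = [[k11, k12], [k12, k22]],
        # then update each entry with E[phi(u) phi(v)] under that Gaussian.
        cov = np.array([[k11, k12], [k12, k22]])
        uv = rng.multivariate_normal(np.zeros(2), cov, size=n_mc)
        fu, fv = phi(uv[:, 0]), phi(uv[:, 1])
        k11 = sigma_w2 * np.mean(fu * fu) + sigma_b2
        k22 = sigma_w2 * np.mean(fv * fv) + sigma_b2
        k12 = sigma_w2 * np.mean(fu * fv) + sigma_b2
    return k12

# Activations with their only non-smoothness at zero (ELU is once differentiable).
relu = lambda z: np.maximum(z, 0.0)
leaky_relu = lambda z: np.where(z > 0, z, 0.1 * z)
elu = lambda z: np.where(z > 0, z, np.expm1(z))

x1 = np.array([1.0, 0.5, -0.3])
x2 = np.array([0.2, -1.0, 0.8])
for name, phi in [("ReLU", relu), ("LeakyReLU", leaky_relu), ("ELU", elu)]:
    print(name, nngp_kernel(x1, x2, phi))
```

Running the script prints one off-diagonal kernel value per activation; how these kernels differ across activations (and depths) is exactly what determines the RKHSs and sample-path smoothness results summarized above.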


