Model Evolution Under Zeroth-Order Optimization: A Neural Tangent Kernel Perspective
arXiv cs.LG / 3/24/2026
Key Points
- The paper studies zeroth-order (ZO) optimization for neural networks, where gradients are estimated from forward passes alone and backpropagation is avoided to save memory; a minimal sketch of such an estimator appears after this list.
- It introduces the Neural Zeroth-order Kernel (NZK) to characterize how neural models evolve in function space under ZO updates, addressing the difficulty caused by noisy stochastic gradient estimates.
- For linear models, the authors prove that the expected NZK is invariant during training and derive a closed-form model evolution under squared loss based on moments of the random perturbation directions.
- The analysis extends to linearized neural networks, interpreting ZO updates as a form of kernel gradient descent under the NZK framework; the standard kernel-descent recursion that this picture generalizes is recalled below.
- Experiments on MNIST, CIFAR-10, and Tiny ImageNet support the theory and show that convergence can be accelerated by using a single shared random perturbation vector.
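The snippet below is a minimal sketch of the two-point random-perturbation scheme that typical ZO methods use, written against a toy linear-regression problem. The perturbation scale `mu`, the number of directions `num_dirs`, and the data setup are illustrative assumptions, not the paper's algorithm or hyperparameters.

```python
# Sketch of a two-point zeroth-order (ZO) gradient estimator: the gradient is
# approximated from forward passes only, using random perturbation directions.
# Toy setup and hyperparameters are illustrative, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: linear regression targets with noise.
n, d = 128, 16
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def loss(w):
    """Squared loss, evaluated with forward passes only (no autodiff)."""
    resid = X @ w - y
    return 0.5 * np.mean(resid ** 2)

def zo_gradient(w, mu=1e-3, num_dirs=10):
    """Two-point ZO estimate: average of directional finite differences
    along random Gaussian directions u ~ N(0, I)."""
    g = np.zeros_like(w)
    for _ in range(num_dirs):
        u = rng.normal(size=w.shape)
        g += (loss(w + mu * u) - loss(w - mu * u)) / (2.0 * mu) * u
    return g / num_dirs

# Plain ZO-SGD loop: each step uses only forward evaluations of the loss.
w = np.zeros(d)
lr = 0.05
for step in range(500):
    w -= lr * zo_gradient(w)
    if step % 100 == 0:
        print(f"step {step:3d}  loss {loss(w):.4f}")
```

As `mu` goes to 0, the estimator's mean approaches the true gradient (with an O(mu²) bias for finite `mu`), which is the property that lets ZO training be studied in function space with a kernel such as the NZK.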
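For orientation, the recursion below is the standard first-order kernel gradient-descent dynamics under squared loss for a linear (or linearized) model, where the kernel Gram matrix K stays constant during training. It is background only, not the paper's NZK-specific closed form; the symbols f_t (model outputs on the training set), y (targets), and η (learning rate) are generic.

```latex
% Baseline: full-batch gradient descent on L(theta) = 1/2 * ||f_theta - y||^2
% for a linear model f_theta = Phi * theta, with constant Gram matrix K = Phi Phi^T.
% The residual contracts linearly through (I - eta K):
\[
  f_{t+1} - y = (I - \eta K)\,(f_t - y)
  \quad\Longrightarrow\quad
  f_t - y = (I - \eta K)^{t}\,(f_0 - y).
\]
```

According to the key points above, the paper's result for ZO training has the same flavor, but with K replaced by the expected NZK, whose form depends on the moments of the random perturbation directions.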