Universal approximation property of Banach space-valued random feature models including random neural networks

arXiv stat.ML · April 28, 2026


Key Points

  • The paper proposes a Banach space–valued extension of random feature learning for scalable supervised learning and large-scale kernel approximation, in which the random feature maps are sampled once and frozen, and only the linear readout is trained.
  • It establishes a universal approximation theorem for these Banach space–valued random feature models in the corresponding Bochner space, framing random feature models as Banach-valued random variables.
  • The authors derive approximation rates and provide an explicit algorithm for learning elements in the target Banach space using such models, including random Fourier/trigonometric regression.
  • The framework covers random neural networks (single-hidden-layer feedforward nets with randomly initialized weights/biases), extending the deterministic neural network universal approximation property to random networks and to non-compact-domain function spaces such as weighted and Sobolev spaces (including approximation of weak derivatives).
  • The work also studies training cost growth as a function of input/output dimension and the inverse of a tolerated approximation error, and includes numerical results showing empirical advantages of random feature models over deterministic ones.
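To make the "only the linear readout is trained" idea concrete, here is a minimal NumPy sketch of random Fourier feature regression on a hypothetical toy target (the target function, dimensions, and feature count are illustrative assumptions, not the paper's algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: approximate a smooth scalar target on [-pi, pi].
n, d, K = 400, 1, 200                     # samples, input dim, feature count
X = rng.uniform(-np.pi, np.pi, size=(n, d))
y = np.sin(2 * X[:, 0])                   # toy target function

# The random feature map is sampled once and then frozen.
W = rng.normal(0.0, 3.0, size=(K, d))     # random frequencies
b = rng.uniform(0.0, 2 * np.pi, size=K)   # random phases
Phi = np.sqrt(2.0 / K) * np.cos(X @ W.T + b)   # (n, K) feature matrix

# Only the linear readout beta is trained: ridge-regularized least squares.
lam = 1e-8
beta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(K), Phi.T @ y)

train_mse = np.mean((Phi @ beta - y) ** 2)
```

Since the feature map is fixed, training reduces to a single linear solve in the readout coefficients, which is where the computational savings over fully trained models come from.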

Abstract

We introduce a Banach space-valued extension of random feature learning, a data-driven supervised machine learning technique for large-scale kernel approximation. By randomly initializing the feature maps, only the linear readout needs to be trained, which reduces the computational complexity substantially. Viewing random feature models as Banach space-valued random variables, we prove a universal approximation result in the corresponding Bochner space. Moreover, we derive approximation rates and an explicit algorithm to learn an element of the given Banach space by such models. The framework of this paper includes random trigonometric/Fourier regression and in particular random neural networks which are single-hidden-layer feedforward neural networks whose weights and biases are randomly initialized, whence only the linear readout needs to be trained. For the latter, we can then lift the universal approximation property of deterministic neural networks to random neural networks, even within function spaces over non-compact domains, e.g., weighted spaces, L^p-spaces, and (weighted) Sobolev spaces, where the latter includes the approximation of the (weak) derivatives. In addition, we analyze when the training costs for approximating a given function grow polynomially in both the input/output dimension and the reciprocal of a pre-specified tolerated approximation error. Furthermore, we demonstrate in a numerical example the empirical advantages of random feature models over their deterministic counterparts.
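The random neural networks covered by the abstract, single-hidden-layer feedforward nets with randomly initialized weights and biases, can be sketched the same way. The following toy example (target function, ReLU activation, and all dimensions are illustrative assumptions, not the paper's construction) fits only the readout of such a network:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random neural network sketch: one hidden layer with frozen random
# weights and biases; only the linear readout is fit.
d, K, n = 2, 300, 500                     # input dim, hidden width, samples
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = np.exp(-np.sum(X**2, axis=1))         # hypothetical smooth target

W = rng.normal(size=(K, d))               # randomly initialized hidden weights
b = rng.uniform(-1.0, 1.0, size=K)        # randomly initialized biases
Phi = np.maximum(X @ W.T + b, 0.0)        # ReLU hidden-layer activations

# Training the readout is a regularized least-squares problem.
lam = 1e-8
beta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(K), Phi.T @ y)

train_mse = np.mean((Phi @ beta - y) ** 2)
```

The universal approximation results in the paper say, roughly, that models of this form can approximate targets arbitrarily well in the relevant Bochner or (weighted) Sobolev norms as the hidden width grows, even though the hidden layer is never trained.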