Dense Neural Networks are not Universal Approximators

arXiv stat.ML / 4/17/2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper challenges the usual “universal approximation” intuition by proving that dense (fully connected) neural networks are not universal approximators under realistic constraints on weights and network dimensions.
  • It uses a model-compression style argument, combining the weak regularity lemma with a reinterpretation of feedforward networks as message-passing graph neural networks.
  • For ReLU networks under natural constraints on weight values and on input/output dimensions, the authors show that there exist Lipschitz-continuous functions these dense architectures cannot approximate.
  • The results indicate intrinsic limitations of dense layers and motivate sparse connectivity as an essential ingredient to recover universality-like approximation behavior.
  • Overall, the work reframes approximation capability as being strongly dependent on architectural restrictions rather than just network size.
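To make the reinterpretation in the second key point concrete, here is a minimal sketch showing that a dense ReLU layer is the same computation as one round of message passing on a complete bipartite graph: each input node sends a weighted message along every edge, and each output node sums its incoming messages, adds a bias, and applies ReLU. All names and values below are illustrative, not taken from the paper.

```python
# Hypothetical sketch: a dense ReLU layer viewed as message passing on a
# complete bipartite graph (inputs on one side, outputs on the other).

def relu(x):
    return max(0.0, x)

def dense_layer(x, W, b):
    # Standard dense layer: y[j] = relu(sum_i W[j][i] * x[i] + b[j])
    return [relu(sum(W[j][i] * x[i] for i in range(len(x))) + b[j])
            for j in range(len(b))]

def message_passing_layer(x, W, b):
    # Same computation as message passing: input node i sends the message
    # W[j][i] * x[i] along edge (i, j); output node j aggregates incoming
    # messages by summation, adds its bias, and applies ReLU.
    messages = {(i, j): W[j][i] * x[i]
                for j in range(len(b)) for i in range(len(x))}
    return [relu(sum(messages[(i, j)] for i in range(len(x))) + b[j])
            for j in range(len(b))]

x = [1.0, -2.0, 0.5]                       # 3 input nodes
W = [[0.3, -0.1, 0.7], [0.5, 0.2, -0.4]]   # edge weights, one row per output
b = [0.1, -0.2]                            # 2 output nodes

assert dense_layer(x, W, b) == message_passing_layer(x, W, b)
```

Under this view, the dense layer corresponds to the complete bipartite edge set, which is what the paper's compression argument exploits; a sparse architecture would simply restrict which `(i, j)` pairs carry messages.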

Abstract

We investigate the approximation capabilities of dense neural networks. While universal approximation theorems establish that sufficiently large architectures can approximate arbitrary continuous functions if there are no restrictions on the weight values, we show that dense neural networks do not possess this universality. Our argument is based on a model compression approach, combining the weak regularity lemma with an interpretation of feedforward networks as message passing graph neural networks. We consider ReLU neural networks subject to natural constraints on weights and input and output dimensions, which model a notion of dense connectivity. Within this setting, we demonstrate the existence of Lipschitz continuous functions that cannot be approximated by such networks. This highlights intrinsic limitations of neural networks with dense layers and motivates the use of sparse connectivity as a necessary ingredient for achieving true universality.