Sparse-Aware Neural Networks for Nonlinear Functionals: Mitigating the Exponential Dependence on Dimension

arXiv cs.LG / 4/9/2026


Key Points

  • The paper studies operator learning with deep neural networks over infinite-dimensional function spaces, focusing on mitigating poor scaling with dimension and on stable recovery from discrete data.
  • It introduces a framework that pairs convolutional architectures, which learn sparse features from a limited number of samples, with deep fully connected networks that approximate nonlinear functionals (a schematic sketch of this two-stage design follows this list).
  • Using universal discretization methods, the authors prove that sparse approximators support stable recovery under both deterministic and random sampling schemes.
  • Theoretical results show improved approximation rates and reduced sample requirements across function spaces characterized by fast frequency decay and mixed smoothness, offering insights into how sparsity alleviates the curse of dimensionality.
  • The work positions sparsity as a key mechanism for better sample efficiency and interpretability in functional learning theory.
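
To make the two-stage design concrete, here is a minimal sketch in PyTorch. The class name `SparseFunctionalNet`, all layer widths, kernel sizes, and the pooling step are hypothetical illustrations of a "convolutional feature extractor feeding a deep fully connected functional approximator", not the paper's actual construction.

```python
import torch
import torch.nn as nn

class SparseFunctionalNet(nn.Module):
    """Hypothetical sketch: a convolutional front end extracts
    sparse features from discrete samples of an input function,
    and a deep fully connected network maps those features to a
    scalar functional value. All sizes are illustrative only."""

    def __init__(self, n_features: int = 32):
        super().__init__()
        # Stage 1: convolutional feature extractor over the
        # sampled function values, treated as a 1-D signal.
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(16, n_features, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # collapse the sample axis
        )
        # Stage 2: deep fully connected network approximating the
        # nonlinear functional on the extracted features.
        self.functional = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        # u: (batch, n_samples) discrete samples of each input function.
        z = self.features(u.unsqueeze(1)).squeeze(-1)
        return self.functional(z)

net = SparseFunctionalNet()
u = torch.randn(8, 256)  # 8 functions, each sampled at 256 points
print(net(u).shape)      # torch.Size([8, 1])
```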

Abstract

Deep neural networks have emerged as powerful tools for learning operators defined over infinite-dimensional function spaces. However, existing theories frequently suffer from poor scaling with dimension and limited interpretability. This work investigates how sparsity can help address these challenges in functional learning, a central ingredient of operator learning. We propose a framework that employs convolutional architectures to extract sparse features from a finite number of samples, together with deep fully connected networks to approximate nonlinear functionals. Using universal discretization methods, we show that sparse approximators enable stable recovery from discrete samples, and our analysis holds under both deterministic and random sampling schemes. These findings lead to improved approximation rates and reduced sample sizes in various function spaces, including those with fast frequency decay and mixed smoothness. They also provide new theoretical insight into how sparsity can alleviate the curse of dimensionality in functional learning.
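
The two sampling regimes mentioned above can be illustrated with a simple helper. The function `discretize` below is a hypothetical stand-in for producing the discrete input vector, using either an equispaced grid (deterministic) or i.i.d. uniform sample locations (random); it is not the paper's universal discretization construction, only an assumed toy version of the sampling step.

```python
import numpy as np

def discretize(f, n: int, scheme: str = "deterministic", rng=None):
    """Sample a function f on [0, 1] at n points.
    Illustrative stand-in for the discretization step;
    not the paper's exact construction."""
    if scheme == "deterministic":
        x = np.linspace(0.0, 1.0, n)       # equispaced grid
    elif scheme == "random":
        rng = rng or np.random.default_rng(0)
        x = np.sort(rng.uniform(0.0, 1.0, n))  # i.i.d. uniform points
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return f(x)

# Example: sample a smooth function under both schemes.
f = lambda x: np.sin(2 * np.pi * x) + 0.5 * np.cos(6 * np.pi * x)
u_det = discretize(f, 256, "deterministic")
u_rnd = discretize(f, 256, "random")
print(u_det.shape, u_rnd.shape)  # (256,) (256,)
```

Either vector could then serve as the discrete input to a functional approximator such as the sketch above.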