Abstract
We derive generalization error bounds for the training of two-layer neural networks without assuming boundedness of the loss function. Our approach combines Wasserstein distance estimates on the discrepancy between a probability distribution and its associated empirical measure with moment bounds for the corresponding stochastic gradient method. In the case of independent test data, we obtain a dimension-free rate of order $O(n^{-1/2})$ on the $n$-sample generalization error, whereas without the independence assumption we derive a bound of order $O(n^{-1/(d_{\rm in}+d_{\rm out})})$, where $d_{\rm in}$ and $d_{\rm out}$ denote the input and output dimensions. Our bounds and their coefficients can be computed explicitly prior to training, and they are confirmed by numerical simulations.
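A schematic restatement of the two regimes, with hypothetical constants $C_1, C_2 > 0$ standing in for the explicitly computable coefficients mentioned above (the precise constants and expectations are given in the paper's main theorems, not reproduced here):
\[
% C_1, C_2 are placeholder constants; the paper computes the actual coefficients explicitly
\mathbb{E}\bigl[\,\lvert \text{generalization error} \rvert\,\bigr] \;\le\;
\begin{cases}
C_1\, n^{-1/2} & \text{(independent test data)},\\[2pt]
C_2\, n^{-1/(d_{\rm in}+d_{\rm out})} & \text{(without independence)}.
\end{cases}
\]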