Separating Geometry from Probability in the Analysis of Generalization

arXiv cs.LG / 4/22/2026


Key Points

  • The paper reframes machine-learning generalization by moving away from the usual (and untestable) assumption that in-sample and out-of-sample data are i.i.d. draws from an infinite population.
  • It derives generalization bounds using sensitivity analysis of optimization solutions under perturbations to the problem data, yielding deterministic, variational-principle style relationships between in-sample and out-of-sample error.
  • The key error term quantifies how close the out-of-sample data are to the in-sample data, so out-of-sample performance is tied directly to in-sample performance plus that proximity term.
  • Statistical assumptions are then applied only ex post, to explain when that error term is small on average or with high probability.
  • Overall, the method separates the “geometry” of the optimization/sensitivity behavior from the “probability” of data similarity, aiming for a more principled analysis of generalization.
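To make the geometry/probability split concrete, here is a minimal numerical sketch of the kind of deterministic bound described above. It is our illustration, not the paper's actual result: we assume a per-example loss that is Lipschitz in the data, match each out-of-sample point to its nearest in-sample point, and verify that out-of-sample error is bounded by the matched in-sample losses plus a Lipschitz constant times a data-proximity term, with no i.i.d. assumption anywhere.

```python
# Illustrative sketch (not the paper's bound): if z -> loss(w, z) is
# L-Lipschitz in the data z, then for each out-of-sample point z'
# matched to its nearest in-sample point z,
#     loss(w, z') <= loss(w, z) + L * ||z' - z||,
# and averaging gives a purely deterministic generalization bound.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D model: loss(w, (x, y)) = |y - w*x|.  As a function of (x, y)
# this is Lipschitz in the 1-norm with constant L = max(|w|, 1).
w = 0.5
L = max(abs(w), 1.0)

def loss(z):
    x, y = z
    return abs(y - w * x)

S = rng.normal(size=(20, 2))                    # in-sample data
S_prime = S + 0.05 * rng.normal(size=(20, 2))   # nearby out-of-sample data

# Match each out-of-sample point to its nearest in-sample point (1-norm),
# and record the average matching distance: the "proximity" error term.
dists = np.abs(S_prime[:, None, :] - S[None, :, :]).sum(axis=2)
nearest = dists.argmin(axis=1)
proximity = dists[np.arange(len(S_prime)), nearest].mean()

out_sample = np.mean([loss(z) for z in S_prime])
matched_in_sample = np.mean([loss(S[j]) for j in nearest])
bound = matched_in_sample + L * proximity

# Holds for these particular datasets by the Lipschitz property alone.
assert out_sample <= bound + 1e-12
```

Probabilistic assumptions enter only afterwards, exactly as the paper suggests: if one is willing to model how `S_prime` is generated, one can then ask when `proximity` is small on average or with high probability.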

Abstract

The goal of machine learning is to find models that minimize prediction error on data that has not yet been seen. Its operational paradigm assumes access to a dataset S and articulates a scheme for evaluating how well a given model performs on an arbitrary sample. The sample can be S (in which case we speak of "in-sample" performance) or some entirely new S' (in which case we speak of "out-of-sample" performance). Traditional analysis of generalization assumes that both in- and out-of-sample data are i.i.d. draws from an infinite population. However, these probabilistic assumptions cannot be verified even in principle. This paper presents an alternative view of generalization through the lens of sensitivity analysis of solutions of optimization problems to perturbations in the problem data. Under this framework, generalization bounds are obtained by purely deterministic means and take the form of variational principles that relate in-sample and out-of-sample evaluations through an error term that quantifies how close out-of-sample data are to in-sample data. Statistical assumptions can then be used *ex post* to characterize the situations when this error term is small (either on average or with high probability).
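A bound of the kind the abstract describes might take the following schematic form (our paraphrase for intuition, not a formula from the paper). Here $R_S$ and $R_{S'}$ denote average loss on the in- and out-of-sample datasets, $\mathrm{Lip}(\ell)$ is a sensitivity (Lipschitz) constant of the loss in the problem data, and $d(S', S)$ is a purely geometric measure of how far $S'$ sits from $S$, e.g. an average nearest-neighbor distance:

```latex
% Deterministic, variational-principle-style bound (illustrative):
%   out-of-sample error <= in-sample error + sensitivity x data proximity
R_{S'}(w) \;\le\; R_{S}(w) \;+\; \mathrm{Lip}(\ell)\, d(S', S)
```

The first two terms involve no randomness at all; statistical modeling is invoked only *ex post*, to argue that $d(S', S)$ is small on average or with high probability under a given data-generating assumption.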