Is Supervised Learning Really That Different from Unsupervised?
arXiv stat.ML / March 30, 2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- The paper argues that supervised learning can be reframed as a two-stage process: first choosing model parameters via an unsupervised criterion, then incorporating the labels y into the outputs without altering the learned parameters.
- It introduces a new model selection criterion that, unlike cross-validation, can be used even when the labels y are unavailable.
- For linear ridge regression, the authors derive bounds on the asymptotic out-of-sample risk relative to the optimal asymptotic risk.
- Experiments and analysis suggest that multiple methods—covering linear and kernel ridge regression, smoothing splines, k-NN, random forests, and neural networks—trained without access to y can match standard supervised counterparts in performance.
- Overall, the results imply that the conceptual gap between supervised and unsupervised learning may be less fundamental than commonly assumed.
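For linear ridge regression, the two-stage view in the key points is easy to make concrete: the ridge hat matrix is computed from X alone, and the labels y enter only through a final linear map that leaves those learned quantities untouched. A minimal NumPy sketch of this decomposition (the fixed penalty `lam` here is an illustrative stand-in for the paper's unsupervised selection criterion, which this summary does not specify):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))                       # features
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=50)  # labels

lam = 1.0  # ridge penalty; the paper would pick this without looking at y

# Stage 1 (unsupervised): everything below depends only on X, never on y.
d = X.shape[1]
H = X @ np.linalg.solve(X.T @ X + lam * np.eye(d), X.T)  # hat matrix

# Stage 2: labels enter only at the end, as a fixed linear map of y.
y_hat = H @ y

# This matches fitting the ridge coefficients directly with y:
beta = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
print(np.allclose(y_hat, X @ beta))  # True
```

The equality holds because H y = X (XᵀX + λI)⁻¹ Xᵀ y = X β, so for ridge regression the only label-dependent step is a matrix-vector product applied after all parameter choices are made.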
Related Articles

Knowledge Governance For The Agentic Economy.
Dev.to

AI server farms heat up the neighborhood for miles around, paper finds
The Register
Does the Claude “leak” actually change anything in practice?
Reddit r/LocalLLaMA

87.4% of My Agent's Decisions Run on a 0.8B Model
Dev.to

"Paperclip", a free tool that turns AI agents into a software team
Dev.to