Learning Expressive Priors for Generalization and Uncertainty Estimation in Neural Networks

arXiv stat.ML · March 31, 2026

Key Points

  • The paper introduces a prior-learning method that uses scalable, structured posteriors from neural networks as informative priors to improve generalization and uncertainty estimation.
  • It claims the learned priors yield expressive probabilistic representations at large scale, functioning as Bayesian analogs of models pre-trained on ImageNet, while producing non-vacuous generalization bounds.
  • The approach extends to continual learning, where the paper argues that the priors retain their favorable generalization and uncertainty properties while learning across tasks.
  • Key technical enablers are (1) efficient sums-of-Kronecker-product computations and (2) derivations and optimizations of tractable objectives that tighten the generalization bounds (see the sketch after this list).
  • Extensive experiments are reported to demonstrate the method’s effectiveness for both uncertainty estimation and generalization.
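
The sums-of-Kronecker-product enabler is what keeps such structured posteriors tractable at network scale: a Kronecker-factored matrix can be applied to a vector without ever being materialized. The NumPy sketch below is not the paper's implementation, only a minimal illustration of the standard identity kron(A, B) @ vec(X) = vec(A @ X @ B.T) (for row-major vec) applied term by term to a sum of Kronecker products; the function name sum_kron_matvec and all sizes are hypothetical.

```python
import numpy as np

def sum_kron_matvec(As, Bs, v):
    """Compute (sum_i kron(A_i, B_i)) @ v without materializing
    the (p*r) x (q*s) Kronecker matrices.

    Relies on kron(A, B) @ vec(X) == vec(A @ X @ B.T) for row-major vec,
    so each term costs O(pqs + prs) instead of O(pqrs)."""
    p, q = As[0].shape
    r, s = Bs[0].shape
    X = v.reshape(q, s)                   # undo the row-major vec
    out = np.zeros((p, r))
    for A, B in zip(As, Bs):
        out += A @ X @ B.T                # one small-matrix product per summand
    return out.ravel()

# Sanity check against the dense computation on a toy example.
rng = np.random.default_rng(0)
As = [rng.standard_normal((3, 4)) for _ in range(2)]
Bs = [rng.standard_normal((5, 6)) for _ in range(2)]
v = rng.standard_normal(4 * 6)
dense = sum(np.kron(A, B) for A, B in zip(As, Bs)) @ v
assert np.allclose(sum_kron_matvec(As, Bs, v), dense)
```

For factors of sizes p×q and r×s, each term costs O(pqs + prs) operations rather than the O(pqrs) of a dense matrix-vector product, which is what makes Kronecker-structured covariances affordable for large networks.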

Abstract

In this work, we propose a novel prior-learning method for advancing generalization and uncertainty estimation in deep neural networks. The key idea is to exploit scalable and structured posteriors of neural networks as informative priors with generalization guarantees. Our learned priors provide expressive probabilistic representations at large scale, akin to Bayesian counterparts of models pre-trained on ImageNet, and further produce non-vacuous generalization bounds. We also extend this idea to a continual learning framework, where the favorable properties of our priors are desirable. Major enablers are our technical contributions: (1) the sums-of-Kronecker-product computations, and (2) the derivations and optimizations of tractable objectives that lead to improved generalization bounds. Empirically, we demonstrate the effectiveness of this method for uncertainty estimation and generalization through exhaustive experiments.
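
To see why an informative prior helps the bounds, consider a generic McAllester-style PAC-Bayes bound, whose complexity term is driven by KL(posterior ‖ prior). The sketch below is a stand-in for the paper's actual objective and uses diagonal Gaussians for brevity (the paper's posteriors are Kronecker-structured); kl_diag_gaussians, mcallester_bound, and all numbers are illustrative assumptions. A prior learned on related data sits close to the posterior, keeping the KL term, and hence the bound, small.

```python
import numpy as np

def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL(q || p) between diagonal Gaussians q = N(mu_q, var_q), p = N(mu_p, var_p)."""
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

def mcallester_bound(emp_risk, kl, n, delta=0.05):
    """Generic McAllester-style PAC-Bayes bound on the expected 0-1 risk."""
    return emp_risk + np.sqrt((kl + np.log(2.0 * np.sqrt(n) / delta)) / (2.0 * n))

rng = np.random.default_rng(1)
d, n = 100_000, 50_000                          # parameter count, training-set size
mu_q = 0.1 * rng.standard_normal(d)             # toy posterior mean
var_q = np.full(d, 0.01)                        # toy posterior variance

# Informative prior "learned" on related data: centered near the posterior.
kl_informative = kl_diag_gaussians(
    mu_q, var_q, mu_q + 0.01 * rng.standard_normal(d), np.full(d, 0.02)
)
# Generic uninformative prior: zero mean, unit variance.
kl_generic = kl_diag_gaussians(mu_q, var_q, np.zeros(d), np.ones(d))

print(mcallester_bound(0.05, kl_informative, n))  # ~0.37: non-vacuous
print(mcallester_bound(0.05, kl_generic, n))      # ~1.40: vacuous for 0-1 loss
```

The same mechanism underlies the continual-learning extension: the posterior learned on one task can serve as the informative prior for the next, keeping the complexity term controlled across tasks.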