A unifying view of contrastive learning, importance sampling, and bridge sampling for energy-based models

arXiv stat.ML / 4/10/2026

💬 Opinion | Ideas & Deep Analysis | Models & Research


Key Points

The paper proposes a unified framework for training and parameter estimation in energy-based models (EBMs) by linking noise contrastive estimation (NCE), reverse logistic regression (RLR), multiple importance sampling (MIS), and bridge sampling under a common perspective.
It shows that these seemingly different estimators can be equivalent when certain conditions hold, clarifying how existing EBM inference methods relate to one another.
The authors use this unifying view to explain why NCE is often flexible and robust, while also outlining specific scenarios where its performance can be improved.
Beyond synthesizing prior methods, the unified view enables the development of new estimators, with the potential to improve both statistical and computational efficiency.
Reproducibility is supported by releasing the MATLAB code used in the numerical experiments.
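To make the claimed equivalence concrete, here is a worked special case, not taken from the paper: a sketch assuming the model parameters are held fixed and only the normalizing constant Z is unknown. The symbols are assumptions introduced for illustration: \tilde p is the unnormalized model, q is a normalized noise density, x_1..x_n are samples from the model, y_1..y_m are samples from q, \nu = m/n, and c = \log Z. Setting the derivative of the NCE logistic objective to zero gives the fixed-point equation of the iterative bridge sampling estimator, which is also the form that reverse logistic regression solves.

```latex
% NCE classifier and objective in the single unknown c = \log Z
h_c(u) = \frac{\tilde p(u)\, e^{-c}}{\tilde p(u)\, e^{-c} + \nu\, q(u)}, \qquad \nu = \frac{m}{n}

J(c) = \sum_{i=1}^{n} \log h_c(x_i) + \sum_{j=1}^{m} \log\bigl(1 - h_c(y_j)\bigr)

% Stationarity: \partial_c \log h_c = -(1 - h_c), \quad \partial_c \log(1 - h_c) = h_c
\frac{\partial J}{\partial c} = 0
\iff \sum_{i=1}^{n} \bigl(1 - h_c(x_i)\bigr) = \sum_{j=1}^{m} h_c(y_j)

% Substituting Z = e^{c} and rearranging gives the bridge sampling fixed point
Z = \frac{\dfrac{1}{m} \displaystyle\sum_{j=1}^{m} \dfrac{\tilde p(y_j)}{n\, \tilde p(y_j)/Z + m\, q(y_j)}}
         {\dfrac{1}{n} \displaystyle\sum_{i=1}^{n} \dfrac{q(x_i)}{n\, \tilde p(x_i)/Z + m\, q(x_i)}}
```

The conditions under which the paper establishes equivalence may be more general than this one-parameter case.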

Abstract

In recent decades, energy-based models (EBMs) have become an important class of probabilistic models in which a component of the likelihood is intractable and therefore cannot be evaluated explicitly. Consequently, parameter estimation in EBMs is challenging for conventional inference methods. In this work, we provide a unified framework that connects noise contrastive estimation (NCE), reverse logistic regression (RLR), multiple importance sampling (MIS), and bridge sampling within the context of EBMs. We further show that these methods are equivalent under specific conditions. This unified perspective clarifies relationships among existing methods and enables the development of new estimators, with the potential to improve statistical and computational efficiency. Furthermore, this study helps elucidate the success of NCE in terms of its flexibility and robustness, while also identifying scenarios in which its performance can be further improved. Hence, rather than being a purely descriptive review, this work offers a unifying perspective and additional methodological contributions. The MATLAB code used in the numerical experiments is also made freely available to support the reproducibility of the results.
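The relationship described in the abstract can be checked numerically on a toy problem. The following is a minimal Python sketch, not the authors' MATLAB code: it assumes a one-dimensional unnormalized Gaussian model whose true normalizing constant is known, a wider Gaussian as the noise density, and model parameters held fixed so that only log Z is estimated. All names (`p_tilde`, `q`, `nce_log_z`, `bridge_log_z`, `SIGMA`, `S`) are hypothetical and chosen for illustration.

```python
import numpy as np

SIGMA = 1.5  # scale of the toy unnormalized model (assumption for illustration)
S = 2.0      # scale of the noise / proposal density (assumption for illustration)

def p_tilde(u):
    # Unnormalized model density; its true normalizer is SIGMA * sqrt(2*pi).
    return np.exp(-0.5 * (u / SIGMA) ** 2)

def q(u):
    # Normalized noise density N(0, S^2).
    return np.exp(-0.5 * (u / S) ** 2) / (S * np.sqrt(2.0 * np.pi))

def nce_log_z(x, y, iters=100):
    # NCE with c = log Z as the only free parameter.
    # The logistic objective is concave in c, so its derivative is
    # decreasing and the root can be found by bisection.
    n, m = len(x), len(y)
    gx = np.log(p_tilde(x)) - np.log(q(x)) - np.log(m / n)
    gy = np.log(p_tilde(y)) - np.log(q(y)) - np.log(m / n)

    def grad(c):
        # h = sigmoid(g - c), written with tanh for numerical stability.
        hx = 0.5 * (1.0 + np.tanh(0.5 * (gx - c)))
        hy = 0.5 * (1.0 + np.tanh(0.5 * (gy - c)))
        return np.sum(hy) - np.sum(1.0 - hx)

    lo, hi = -20.0, 20.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if grad(mid) > 0.0:
            lo = mid  # derivative still positive: c is too small
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bridge_log_z(x, y, iters=200):
    # Iterative bridge sampling estimator (fixed-point iteration on Z).
    n, m = len(x), len(y)
    z = 1.0
    for _ in range(iters):
        num = np.mean(p_tilde(y) / (n * p_tilde(y) / z + m * q(y)))
        den = np.mean(q(x) / (n * p_tilde(x) / z + m * q(x)))
        z = num / den
    return np.log(z)

rng = np.random.default_rng(0)
x = rng.normal(0.0, SIGMA, size=5000)  # samples from the model (the "data")
y = rng.normal(0.0, S, size=5000)      # samples from the noise density

log_z_true = np.log(SIGMA * np.sqrt(2.0 * np.pi))
log_z_nce = nce_log_z(x, y)
log_z_bridge = bridge_log_z(x, y)
log_z_is = np.log(np.mean(p_tilde(y) / q(y)))  # plain importance sampling

print(log_z_true, log_z_nce, log_z_bridge, log_z_is)
```

In this setting the NCE and bridge sampling values should agree to numerical precision, because both solve the same stationarity equation, while the plain importance sampling estimate uses only the noise samples and generally differs slightly. Estimating model parameters jointly with log Z, as in full EBM training, is outside the scope of this sketch.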