Decentralized Machine Learning with Centralized Performance Guarantees via Gibbs Algorithms

arXiv stat.ML / 4/23/2026


Key Points

  • The paper shows that decentralized learning can match centralized performance without clients sharing their local datasets: clients exchange Gibbs measures instead.
  • Under an ERM-RER (empirical risk minimization with relative-entropy regularization) framework and a forward-backward client communication scheme, sharing locally obtained Gibbs measures is sufficient to reproduce centralized ERM-RER performance.
  • The method uses each client’s Gibbs measure as the reference measure for the next client, giving a principled mechanism for encoding prior information through reference measures (see the sketch after this list).
  • Achieving centralized-level performance in the decentralized setting hinges on a specific scaling of each client’s regularization factor with its local sample size.
  • The work suggests a new paradigm for decentralized learning that replaces data sharing with sharing local inductive bias via reference measures over model sets.
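
To make the mechanism concrete, here is a minimal numerical sketch over a finite model set. It is not the paper’s implementation: the losses, client sample sizes, the regularization factor `lam`, and the scaling `lam * n_total / n_k` are illustrative assumptions, chosen under the convention that each client minimizes its average local empirical risk, so that the sequential Gibbs updates reproduce the centralized Gibbs measure exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 5 candidate models, 3 clients with different sample sizes.
# Each row of a client's array holds the per-sample loss of every model.
num_models = 5
clients = [rng.normal(size=(n, num_models)) for n in (8, 3, 12)]
n_total = sum(len(c) for c in clients)
lam = 0.5                                   # centralized regularization factor (assumed)
q0 = np.full(num_models, 1.0 / num_models)  # common initial reference measure

def gibbs(reference, emp_risk, lam_k):
    """ERM-RER solution: dP/dQ proportional to exp(-empirical risk / lam_k)."""
    w = reference * np.exp(-emp_risk / lam_k)
    return w / w.sum()

# Centralized ERM-RER: one Gibbs measure from the pooled dataset.
pooled_risk = np.vstack(clients).mean(axis=0)
p_central = gibbs(q0, pooled_risk, lam)

# Decentralized: client k uses client k-1's Gibbs measure as its reference,
# with its factor scaled as lam_k = lam * n_total / n_k (assumed convention).
p = q0
for losses in clients:
    p = gibbs(p, losses.mean(axis=0), lam * n_total / len(losses))

print(np.allclose(p, p_central))  # True: the two measures coincide
```

The equality holds because each update multiplies the running measure by exp(−local risk / λ_k), so the exponents telescope into the pooled empirical risk once the factors are scaled this way.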

Abstract

In this paper, it is shown, for the first time, that centralized performance is achievable in decentralized learning without sharing the local datasets. Specifically, when clients adopt an empirical risk minimization with relative-entropy regularization (ERM-RER) learning framework and a forward-backward communication between clients is established, it suffices to share the locally obtained Gibbs measures to achieve the same performance as that of a centralized ERM-RER with access to all the datasets. The core idea is that the Gibbs measure produced by client k is used as the reference measure by client k+1. This effectively establishes a principled way to encode prior information through a reference measure. In particular, achieving centralized performance in the decentralized setting requires a specific scaling of the regularization factors with the local sample sizes. Overall, this result opens the door to novel decentralized learning paradigms that shift the collaboration strategy from sharing data to sharing the local inductive bias via the reference measures over the set of models.
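
The required scaling can be reconstructed in a few lines. The derivation below is a sketch under assumed conventions: each client k minimizes its average empirical risk over its n_k samples with its own factor, and the symbols λ, λ_k, and Z_k are notation introduced here for illustration, not taken from the paper.

```latex
% Client k's ERM-RER solution with reference P_{k-1} and factor \lambda_k:
%   \frac{\mathrm{d}P_k}{\mathrm{d}P_{k-1}}(\theta)
%     = \frac{1}{Z_k}\exp\!\big(-\hat{L}_k(\theta)/\lambda_k\big).
% Chaining K clients telescopes the Radon--Nikodym derivatives:
\frac{\mathrm{d}P_K}{\mathrm{d}Q}(\theta)
  \propto \exp\!\Big(-\sum_{k=1}^{K} \frac{\hat{L}_k(\theta)}{\lambda_k}\Big).
% Centralized Gibbs measure on the pooled dataset of n = \sum_k n_k samples:
\frac{\mathrm{d}P^\star}{\mathrm{d}Q}(\theta)
  \propto \exp\!\Big(-\frac{1}{\lambda n}\sum_{k=1}^{K} n_k \hat{L}_k(\theta)\Big).
% Matching the exponents term by term forces
%   \lambda_k = \lambda\, n / n_k,
% i.e., each client's factor must be scaled according to its local sample size.
```

Under this convention the effective regularization is λ per pooled sample, so clients holding more local data deviate more from the incoming reference measure.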