On Gossip Algorithms for Machine Learning with Pairwise Objectives

arXiv cs.LG / 2026/3/26

Key Points

  • The paper studies gossip-based machine learning methods for scenarios common in IoT networks where data are distributed and constrained by storage, computation, and communication limits.
  • Unlike most prior work that assumes objectives based on simple averages of individual observations, it focuses on pairwise objectives modeled as degree-two U-statistics.
  • The authors target motivating tasks such as similarity learning, ranking, and clustering, and revisit gossip algorithms tailored to these pairwise U-statistic objectives.
  • A comprehensive theoretical convergence framework is developed, including refined upper and lower bounds to explain when the methods succeed.
  • The analysis identifies specific graph properties that most strongly determine efficiency, highlighting how network topology affects performance.
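
To make the contrast between the two kinds of objective concrete, the block below writes out the textbook definitions. The notation (observations X_1, ..., X_n, a function f, a symmetric kernel h) is assumed here for illustration and is not taken from the paper.

```latex
% Usual setting: the objective is an average over individual observations
\[
  \bar{f}_n = \frac{1}{n} \sum_{i=1}^{n} f(X_i)
\]

% Pairwise setting: a U-statistic of degree two with symmetric kernel h
\[
  U_n(h) = \frac{1}{n(n-1)} \sum_{i \neq j} h(X_i, X_j)
\]

% Example: the kernel h(x, x') = (x - x')^2 / 2 yields the unbiased sample variance
\[
  U_n(h) = \frac{1}{n-1} \sum_{i=1}^{n} \bigl( X_i - \bar{X}_n \bigr)^2
\]
```

The difficulty in a network is visible in the second formula: each term needs two observations that typically sit on different nodes, so a node cannot evaluate its share of the sum from local data alone.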

Abstract

In the IoT era, information is more and more frequently picked up by connected smart sensors with increasing, though limited, storage, communication and computation abilities. Whether due to privacy constraints or to the structure of the distributed system, the development of statistical learning methods dedicated to data that are shared over a network is now a major issue. Gossip-based algorithms have been developed for the purpose of solving a wide variety of statistical learning tasks, ranging from data aggregation over sensor networks to decentralized multi-agent optimization. Whereas the vast majority of contributions consider situations where the function to be estimated or optimized is a basic average of individual observations, it is the goal of this article to investigate the case where the latter is of pairwise nature, taking the form of a U-statistic of degree two. Motivated by various problems such as similarity learning, ranking or clustering for instance, we revisit gossip algorithms specifically designed for pairwise objective functions and provide a comprehensive theoretical framework for their convergence. This analysis fills a gap in the literature by establishing conditions under which these methods succeed, and by identifying the graph properties that critically affect their efficiency. In particular, a refined analysis of the convergence upper and lower bounds is performed.
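
As a rough illustration of how a gossip scheme can handle a pairwise objective, here is a minimal Python sketch of one simple synchronous approach: each node keeps its own observation, an auxiliary observation that travels through the network by being swapped along randomly chosen edges, and a running estimate that is averaged with a neighbor's. This is an assumption-laden toy, not necessarily the algorithm analyzed in the paper; the function names, the ring graph, and the variance kernel are all hypothetical choices made for this example. Note that this scheme targets the average of h over all n² ordered pairs (diagonal included), which differs from the degree-two U-statistic by a factor (n-1)/n when h(x, x) = 0.

```python
import random


def full_pairwise_average(xs, h):
    """Centralized reference: average of h over all n*n ordered pairs."""
    n = len(xs)
    return sum(h(a, b) for a in xs for b in xs) / (n * n)


def gossip_pairwise_estimate(xs, edges, h, steps, seed=0):
    """Toy synchronous gossip estimate of the pairwise average (illustrative only).

    xs    : one observation per node (node k holds xs[k])
    edges : list of (i, j) pairs describing a connected graph
    h     : pairwise kernel h(x, x')
    """
    rng = random.Random(seed)
    n = len(xs)
    aux = list(xs)        # auxiliary observation currently held by each node
    est = [0.0] * n       # local estimates

    for t in range(1, steps + 1):
        # Each node folds h(own observation, current auxiliary) into a running mean.
        for k in range(n):
            est[k] += (h(xs[k], aux[k]) - est[k]) / t

        # One edge wakes up: its endpoints average their estimates
        # and swap auxiliary observations, so data random-walks over the graph.
        i, j = rng.choice(edges)
        est[i] = est[j] = 0.5 * (est[i] + est[j])
        aux[i], aux[j] = aux[j], aux[i]

    return est


if __name__ == "__main__":
    n = 6
    xs = [float(k + 1) for k in range(n)]
    ring = [(k, (k + 1) % n) for k in range(n)]      # hypothetical topology
    variance_kernel = lambda a, b: 0.5 * (a - b) ** 2

    print("centralized:", full_pairwise_average(xs, variance_kernel))
    print("gossip     :", gossip_pairwise_estimate(xs, ring, variance_kernel, 20000))
```

Swapping the ring for a denser edge list lets auxiliary observations circulate faster, which is one informal way to see why the abstract emphasizes graph properties as a driver of efficiency.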