Distributed Gradient Clustering: Convergence and the Effect of Initialization

arXiv cs.LG · 2026-03-24


Key Points

  • The paper analyzes how different initializations of cluster centers affect the convergence and performance of distributed, neighbor-to-neighbor gradient-based clustering algorithms operating over connected user networks.
  • Through extensive numerical experiments, the authors show their distributed methods are generally more robust to initialization effects than centralized gradient clustering approaches.
  • Motivated by the K-means++ idea, the study introduces a novel distributed center initialization scheme intended to improve clustering outcomes over baseline random initialization.
  • The work focuses on achieving global clustering of jointly distributed data using only local datasets and limited communication with immediate neighbors, highlighting algorithmic resilience and practical initialization strategies.
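The two ingredients summarized above are gradient-based updates of the cluster centers and a K-means++-style seeding. As a rough, centralized illustration only, here is a minimal sketch of both; the function names, learning rate, and single-machine setting are assumptions for this example, and the paper's actual distributed, neighbor-to-neighbor variants are not reproduced here.

```python
import numpy as np

def kmeanspp_init(X, k, rng):
    """K-means++-style seeding [3]: each new center is drawn with
    probability proportional to squared distance to the nearest
    already-chosen center."""
    n = X.shape[0]
    centers = [X[rng.integers(n)]]
    for _ in range(k - 1):
        d2 = np.min(((X[:, None, :] - np.array(centers)[None, :, :]) ** 2).sum(-1), axis=1)
        centers.append(X[rng.choice(n, p=d2 / d2.sum())])
    return np.array(centers)

def gradient_clustering_step(X, centers, lr=0.1):
    """One gradient step on the K-means objective
    J(C) = sum_i min_j ||x_i - c_j||^2, with assignments held
    fixed within the step (a centralized stand-in for the
    distributed updates studied in the paper)."""
    assign = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
    grad = np.zeros_like(centers)
    for j in range(centers.shape[0]):
        pts = X[assign == j]
        if len(pts):
            # d/dc_j sum_{i in cluster j} ||x_i - c_j||^2 = 2 n_j (c_j - mean_j)
            grad[j] = 2 * len(pts) * (centers[j] - pts.mean(axis=0))
    return centers - lr * grad / X.shape[0], assign
```

In the distributed setting of the paper, each user would run updates like these on its local dataset and average information only with immediate neighbors; the sketch above omits that communication step.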

Abstract

We study the effects of center initialization on the performance of a family of distributed gradient-based clustering algorithms introduced in [1], which operate over connected networks of users. In the considered scenario, each user holds a local dataset and communicates only with its immediate neighbours, with the aim of finding a global clustering of the joint data. We perform extensive numerical experiments evaluating the effects of center initialization on the performance of our family of methods, demonstrating that our methods are more resilient to the effects of initialization than centralized gradient clustering [2]. Next, inspired by the K-means++ initialization [3], we propose a novel distributed center initialization scheme, which is shown to improve the performance of our methods over the baseline random initialization.
