Federated Hierarchical Clustering with Automatic Selection of Optimal Cluster Numbers

arXiv cs.AI / 3/16/2026

Key Points

Fed-$k^*$-HC is a federated clustering framework that automatically determines the optimal number of clusters $k^*$ using hierarchical clustering and micro-subcluster prototypes generated by clients.
It addresses the challenges of unknown cluster counts, imbalanced cluster sizes, and privacy-preserving transmission constraints in federated learning.
Prototypes from clients are uploaded to a server and merged hierarchically through a density-based merging design to explore clusters of varying sizes and shapes.
Merging proceeds progressively and self-terminates according to the neighboring relationships among prototypes, which determines $k^*$.
Experiments on diverse datasets demonstrate the method's capability to accurately identify a proper number of clusters in federated clustering.
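
The client-side step in the points above can be illustrated with a minimal sketch. It assumes that each client summarizes its local data as micro-subcluster prototypes by running plain k-means with a deliberately large number of centroids. The summary does not say how Fed-$k^*$-HC actually forms its micro-subclusters, so the function name `micro_subcluster_prototypes` and the parameters `n_micro`, `n_iter`, and `seed` are hypothetical.

```python
import random
from math import dist


def micro_subcluster_prototypes(points, n_micro, n_iter=20, seed=0):
    """Summarize local points as (centroid, count) prototypes.

    ASSUMPTION: plain Lloyd's k-means stands in for the paper's
    micro-subcluster generation, which the summary does not specify.
    `points` is a list of equal-length tuples; n_micro <= len(points).
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, n_micro)  # initialize from the data
    groups = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        groups = [[] for _ in range(n_micro)]
        for p in points:
            j = min(range(n_micro), key=lambda c: dist(p, centroids[c]))
            groups[j].append(p)
        # Update step: move each centroid to the mean of its group;
        # an empty group keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    # Only prototypes (not raw points) would leave the client.
    return [(c, len(g)) for c, g in zip(centroids, groups) if g]


if __name__ == "__main__":
    rng = random.Random(1)
    local = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
    local += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(50)]
    for centroid, count in micro_subcluster_prototypes(local, n_micro=8):
        print(centroid, count)
```

Choosing `n_micro` well above the expected number of clusters keeps each prototype small, so that large and small clusters are both covered by several prototypes before any merging happens on the server.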

Abstract

Federated Clustering (FC) is an emerging and promising solution for exploring data distribution patterns in distributed, privacy-protected data in an unsupervised manner. Existing FC methods implicitly assume that clients hold a known number of uniformly sized clusters. However, the true number of clusters is typically unknown, and cluster sizes are naturally imbalanced in real scenarios. Furthermore, the privacy-preserving transmission constraints in federated learning inevitably reduce the usable information, making the development of robust and accurate FC extremely challenging. Accordingly, we propose a novel FC framework named Fed-$k^*$-HC, which can automatically determine an optimal number of clusters $k^*$ based on the data distribution explored through hierarchical clustering. To obtain the global data distribution for $k^*$ determination, we let each client generate micro-subclusters. Their prototypes are then uploaded to the server for hierarchical merging. The density-based merging design allows exploring clusters of varying sizes and shapes, and the progressive merging process can self-terminate according to the neighboring relationships among the prototypes to determine $k^*$. Extensive experiments on diverse datasets demonstrate the FC capability of the proposed Fed-$k^*$-HC in accurately exploring a proper number of clusters.
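
The server-side idea of merging uploaded prototypes until the process stops by itself can be sketched roughly as follows. This is not the paper's algorithm: the abstract does not give the density-based merging rule or the termination criterion, so the sketch substitutes single-linkage merging with a cutoff derived from nearest-neighbor distances among prototypes. The function name `merge_prototypes` and the parameter `gap_factor` are hypothetical.

```python
from math import dist
from statistics import median


def merge_prototypes(prototypes, gap_factor=3.0):
    """Merge prototypes hierarchically and return (k_star, labels).

    ASSUMPTION: single-linkage merging with a neighbor-distance cutoff
    stands in for the paper's density-based merging and self-termination.
    `prototypes` is a list of at least two equal-length tuples.
    """
    n = len(prototypes)
    # Distance from each prototype to its nearest neighbor.
    nn = [
        min(dist(prototypes[i], prototypes[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    # Illustrative stopping threshold (hypothetical rule).
    threshold = gap_factor * median(nn)

    parent = list(range(n))  # union-find forest over prototypes

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Visit prototype pairs from closest to farthest.
    pairs = sorted(
        (dist(prototypes[i], prototypes[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    for d, i, j in pairs:
        if d > threshold:
            break  # every remaining pair is farther apart: stop merging
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    roots = [find(i) for i in range(n)]
    ids = {r: k for k, r in enumerate(dict.fromkeys(roots))}
    return len(ids), [ids[r] for r in roots]


if __name__ == "__main__":
    uploaded = [(0, 0), (0, 1), (1, 0), (1, 1),
                (10, 10), (10, 11), (11, 10), (11, 11)]
    k_star, labels = merge_prototypes(uploaded)
    print(k_star, labels)  # 2 [0, 0, 0, 0, 1, 1, 1, 1]
```

The number of clusters is not passed in anywhere: it falls out of where merging stops, which is the property the abstract attributes to the progressive merging process.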
