FedRG: Unleashing the Representation Geometry for Federated Learning with Noisy Clients

arXiv cs.LG / 3/23/2026


Key Points

  • FedRG introduces a representation-geometry-based method for robust federated learning with noisy clients, identifying noisy labels from the structure of learned representations rather than relying on scalar loss values.
  • The approach creates label-agnostic spherical representations via self-supervision and fits a spherical von Mises-Fisher mixture model to capture semantic clusters using previously identified clean samples.
  • It integrates a semantic-label soft mapping to compute a distribution divergence between label-free and annotated-label spaces, enabling robust noisy-sample identification and iterative model updates.
  • The method also applies a personalized noise absorption matrix on noisy labels to bolster optimization, with extensive experiments showing notable improvements over state-of-the-art FL methods across heterogeneous data and noisy client scenarios.
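To make the vMF mixture step concrete, here is a minimal sketch of EM on unit-norm features. It is not the paper's procedure: it assumes a single fixed concentration `kappa` shared by all components (so the vMF normalizing constant cancels and no Bessel functions are needed), and the function names, the farthest-point initialization, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def normalize(x, eps=1e-12):
    """Project each row onto the unit sphere."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def init_means(z, k):
    """Farthest-point initialization: start at z[0], then repeatedly add the
    point least similar (by cosine) to the means chosen so far."""
    idx = [0]
    for _ in range(k - 1):
        nearest_sim = (z @ z[idx].T).max(axis=1)
        idx.append(int(nearest_sim.argmin()))
    return z[idx].copy()

def responsibilities(z, mu, pi, kappa):
    """Posterior over components. The vMF normalizer is omitted because it is
    identical for every component when kappa is shared, so it cancels."""
    logits = kappa * (z @ mu.T) + np.log(np.maximum(pi, 1e-12))
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(logits)
    return resp / resp.sum(axis=1, keepdims=True)

def fit_vmf_mixture(z, k, kappa=20.0, n_iter=30):
    """Simplified EM: shared, fixed kappa (an assumption of this sketch)."""
    z = normalize(z)
    mu = init_means(z, k)
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        resp = responsibilities(z, mu, pi, kappa)  # E-step
        mu = normalize(resp.T @ z)                 # M-step: mean directions
        pi = resp.mean(axis=0)                     # M-step: mixing weights
    return mu, pi, responsibilities(z, mu, pi, kappa)

# Synthetic stand-in for self-supervised features: two noisy directions in 3-D.
rng = np.random.default_rng(0)
a = np.array([1.0, 0.0, 0.0]) + 0.1 * rng.standard_normal((50, 3))
b = np.array([0.0, 1.0, 0.0]) + 0.1 * rng.standard_normal((50, 3))
mu, pi, resp = fit_vmf_mixture(np.vstack([a, b]), k=2)
```

Each row of `resp` is a label-free posterior over semantic clusters for one sample. A fuller implementation would also estimate a per-component concentration, which brings the Bessel-function normalizer back into the E-step.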

Abstract

Federated learning (FL) suffers from performance degradation due to the inevitable presence of noisy annotations in distributed scenarios. Existing approaches have made progress in separating noisy samples from the dataset for label correction by leveraging loss values. However, noisy-sample recognition that relies on a scalar loss is unreliable for FL under heterogeneous scenarios. In this paper, we rethink this paradigm from a representation perspective and propose FedRG (Federated under Representation Geometry), which follows the principle of "representation geometry priority" to recognize noisy labels. First, FedRG creates label-agnostic spherical representations using self-supervision. It then iteratively fits a spherical von Mises-Fisher (vMF) mixture model to this geometry using previously identified clean samples to capture semantic clusters. This geometric evidence is integrated with a semantic-label soft mapping mechanism to derive a distribution divergence between the label-free and annotated label-conditioned feature spaces, which robustly identifies noisy samples and updates the vMF mixture model with the newly separated clean dataset. Lastly, we employ an additional personalized noise absorption matrix on noisy labels to achieve robust optimization. Extensive experimental results demonstrate that FedRG significantly outperforms state-of-the-art methods for FL with data heterogeneity under diverse noisy-client scenarios.
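The divergence-based identification step can be illustrated with a small sketch. The abstract does not specify the divergence, the form of the soft mapping, or a decision rule, so everything here is an assumption: the mapping is estimated as a smoothed cluster-to-class table from the currently clean samples, the divergence is Jensen-Shannon, and the threshold of 0.3 and all function names are hypothetical.

```python
import numpy as np

def soft_mapping(resp, labels, clean_mask, n_classes, alpha=1e-3):
    """Estimate P(class | cluster) from clean samples only.
    resp: (n, K) label-free cluster posteriors; returns a (K, C) row-stochastic matrix."""
    onehot = np.eye(n_classes)[labels[clean_mask]]
    counts = resp[clean_mask].T @ onehot + alpha  # alpha: additive smoothing
    return counts / counts.sum(axis=1, keepdims=True)

def js_divergence(p, q, eps=1e-12):
    """Row-wise Jensen-Shannon divergence (natural log, bounded by ln 2)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log(p / m), axis=1)
    kl_qm = np.sum(q * np.log(q / m), axis=1)
    return 0.5 * (kl_pm + kl_qm)

def flag_noisy(resp, labels, clean_mask, n_classes, threshold=0.3):
    """Compare the class distribution implied by geometry with the annotation."""
    mapping = soft_mapping(resp, labels, clean_mask, n_classes)
    p_geometry = resp @ mapping          # label-free evidence mapped to classes
    q_label = np.eye(n_classes)[labels]  # annotated label as a one-hot vector
    score = js_divergence(p_geometry, q_label)
    return score > threshold, score

# Toy input: samples 0-2 sit in cluster 0, samples 3-5 in cluster 1.
resp = np.array([[0.95, 0.05], [0.90, 0.10], [0.92, 0.08],
                 [0.10, 0.90], [0.05, 0.95], [0.08, 0.92]])
labels = np.array([0, 0, 1, 1, 1, 1])  # sample 2 disagrees with its cluster
clean_mask = np.array([True, True, False, True, True, False])
flagged, score = flag_noisy(resp, labels, clean_mask, n_classes=2)
```

In an iterative scheme like the one the abstract describes, the samples left unflagged would become the next round's `clean_mask`, and the mixture model would be refit on them before scoring again.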