Federated fairness-aware classification under differential privacy

arXiv stat.ML / 3/26/2026

Key Points

  • The paper studies how differential privacy and algorithmic fairness interact in federated learning for demographic-disparity-constrained classification.
  • It proposes a two-step federated algorithm called FDP-Fair, and a computationally lightweight single-server variant called CDP-Fair.
  • Under mild structural assumptions, the authors prove theoretical guarantees covering privacy, fairness, and bounds on excess risk.
  • The analysis breaks down the “private fairness-aware excess risk” into four contributors: intrinsic classification cost, private classification cost, non-private fairness cost, and private fairness cost.
  • Experiments on synthetic and real datasets complement the theoretical findings and highlight the practicality of the proposed algorithms.
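
To make the fairness constraint in the first key point concrete, here is a minimal sketch of how demographic disparity is commonly measured for a binary classifier. It assumes a binary sensitive attribute and the standard definition (the absolute gap in positive-prediction rates between the two groups); the paper's exact definition is not given in this summary. The function names and the tolerance `alpha` are hypothetical, and this is not the FDP-Fair or CDP-Fair algorithm.

```python
# Illustrative sketch only: a common definition of demographic disparity,
# assumed here because the summary does not state the paper's exact one.

def positive_rate(y_pred, group, value):
    """Fraction of positive predictions among samples whose group equals `value`."""
    selected = [p for p, g in zip(y_pred, group) if g == value]
    if not selected:
        raise ValueError(f"no samples with group == {value}")
    return sum(selected) / len(selected)


def demographic_disparity(y_pred, group):
    """Absolute gap in positive-prediction rates between group 1 and group 0."""
    return abs(positive_rate(y_pred, group, 1) - positive_rate(y_pred, group, 0))


def satisfies_constraint(y_pred, group, alpha):
    """True if the disparity is at most `alpha` (a hypothetical tolerance level)."""
    return demographic_disparity(y_pred, group) <= alpha


# Toy data: binary predictions and a binary sensitive attribute.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
group = [1, 1, 1, 1, 0, 0, 0, 0]

gap = demographic_disparity(y_pred, group)
print(f"demographic disparity: {gap:.2f}")
print("within alpha = 0.1:", satisfies_constraint(y_pred, group, 0.1))
```

A disparity-constrained classifier is one trained or post-processed so that this gap stays below the chosen tolerance; under differential privacy the group rates themselves would additionally have to be estimated privately.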

Abstract

Privacy and algorithmic fairness have become two central issues in modern machine learning. Although each has separately emerged as a rapidly growing research area, their joint effect remains comparatively under-explored. In this paper, we systematically study the joint impact of differential privacy and fairness on classification in a federated setting, where data are distributed across multiple servers. Targeting demographic-disparity-constrained classification under federated differential privacy, we propose a two-step algorithm, namely FDP-Fair. In the special case where there is only one server, we further propose a simple yet powerful algorithm, namely CDP-Fair, serving as a computationally lightweight alternative. Under mild structural assumptions, theoretical guarantees on privacy, fairness and excess risk control are established. In particular, we disentangle the sources of the private fairness-aware excess risk into (a) intrinsic cost of classification, (b) cost of private classification, (c) non-private cost of fairness and (d) private cost of fairness. Our theoretical findings are complemented by extensive numerical experiments on both synthetic and real datasets, highlighting the practicality of our designed algorithms.
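
The four-part decomposition described in the abstract can be written schematically as follows. This is only a sketch: the symbols are placeholders introduced here for illustration, not the paper's notation, and the summary does not state the rate of any individual term or the exact form of the bound.

```latex
% Schematic only; all symbols are assumed placeholders, not the paper's notation.
% \mathcal{E}: private fairness-aware excess risk of the learned classifier.
\[
  \mathcal{E}
  \;\lesssim\;
  \underbrace{C_{\mathrm{cls}}}_{\text{(a) intrinsic cost of classification}}
  + \underbrace{C_{\mathrm{cls}}^{\mathrm{priv}}}_{\text{(b) cost of private classification}}
  + \underbrace{C_{\mathrm{fair}}}_{\text{(c) non-private cost of fairness}}
  + \underbrace{C_{\mathrm{fair}}^{\mathrm{priv}}}_{\text{(d) private cost of fairness}}
\]
```

Read this way, terms (a) and (c) would remain even without any privacy requirement, while (b) and (d) are the additional price of differential privacy for the classification and fairness parts respectively.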