FedSEA: Achieving Benefit of Parallelization in Federated Online Learning

arXiv cs.LG / 4/22/2026


Key Points

  • The paper studies online federated learning (OFL) where standard adversary assumptions often block any advantage from parallelization and do not model statistical variation sources well.
  • It introduces a Stochastically Extended Adversary (SEA) model that, while keeping the loss function fixed across clients over time, allows the adversary to independently and dynamically choose each client’s data distribution at every time step.
  • The authors propose the 2OFL algorithm, combining online stochastic gradient descent on clients with periodic global aggregation at the server.
  • They prove global network regret bounds, including an O(sqrt(T)) rate for smooth convex losses and an O(log(T)) rate for smooth strongly convex losses.
  • The analysis separates spatial (across clients) and temporal (over time) heterogeneity effects and identifies a mild-temporal-variation regime where regret actually improves with parallelization, tightening prior pessimistic results.
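The client/server loop described above can be sketched as follows. This is a hedged, minimal simulation of the 2OFL idea, not the paper's exact algorithm: the quadratic loss, step size, noise model, and aggregation period are all illustrative assumptions. Each client runs online SGD on its own stochastic data stream (the SEA adversary picks a per-client, per-round distribution), and the server averages the client models every `tau` rounds.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 4        # number of clients
T = 200      # time horizon
tau = 5      # aggregation period: server averages every tau rounds (assumed)
d = 3        # model dimension
eta = 0.05   # step size (assumed)

w_star = np.ones(d)    # common optimum of the fixed expected loss
w = np.zeros((M, d))   # one local model per client

for t in range(1, T + 1):
    for m in range(M):
        # SEA-style variation: each client's data distribution may differ
        # per client and per round; here modeled as Gaussian gradient noise
        # with a client-specific scale (spatial heterogeneity).
        noise = rng.normal(scale=0.1 * (m + 1), size=d)
        grad = (w[m] - w_star) + noise   # stochastic gradient of 0.5*||w - w*||^2
        w[m] -= eta * grad               # local online SGD step
    if t % tau == 0:
        w[:] = w.mean(axis=0)            # periodic global aggregation

w_avg = w.mean(axis=0)
```

Averaging every `tau` rounds is what lets the stochastic gradient noise cancel across clients; in the paper's mild-temporal-variation regime this cancellation is what makes the regret improve with the number of clients.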

Abstract

Online federated learning (OFL) has emerged as a popular framework for decentralized decision-making over continuous data streams without compromising client privacy. However, the adversary model assumed in standard OFL typically precludes any potential benefits of parallelization. Further, it fails to adequately capture the different sources of statistical variation in OFL problems. In this paper, we extend the OFL paradigm by integrating a stochastically extended adversary (SEA). Under this framework, the loss function remains fixed across clients over time. However, the adversary dynamically and independently selects the data distribution for each client at each time step. We propose the 2OFL algorithm to solve this problem, which utilizes online stochastic gradient descent at the clients, along with periodic global aggregation via the server. We establish bounds on the global network regret over a time horizon \(T\) for two classes of functions: (1) for smooth and convex losses, we prove an \(\mathcal{O}(\sqrt{T})\) bound, and (2) for smooth and strongly convex losses, we prove an \(\mathcal{O}(\log T)\) bound. Through careful analysis, we quantify the individual impact of both spatial (across clients) and temporal (over time) data heterogeneity on the regret bounds. Consequently, we identify a regime of mild temporal variation (relative to stochastic gradient variance), where the network regret improves with parallelization. Hence, in the SEA setting, our results improve the existing pessimistic worst-case results in online federated learning.
