Subspace Optimization for Efficient Federated Learning under Heterogeneous Data

arXiv cs.LG / April 29, 2026


Key Points

  • The paper addresses federated learning challenges in the large-model regime, where communication, memory, and computation are limited and non-IID client data can cause harmful training drift.
  • It proposes Subspace Optimization for Federated Learning (SSF), which performs heterogeneity-corrected updates in a low-dimensional subspace using only projected quantities to reduce overhead (see the sketch after this list).
  • SSF retains full-dimensional control information via a backfill-style mechanism that preserves residual components when the active subspace changes, aiming to maintain stability and effectiveness (sketched after the abstract).
  • Under standard smoothness and bounded-variance assumptions, SSF achieves a non-asymptotic convergence rate of order Õ(1/T + 1/√(NKT)).
  • Experiments indicate SSF offers strong accuracy–efficiency trade-offs on heterogeneous data compared with existing approaches.
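
The summary does not spell out the update rule itself. As a rough reading of the second bullet, one corrected local step might look like the following minimal NumPy sketch. Everything here is an assumption: the projection P, the SCAFFOLD-style control variates c_global and c_local, and the function names are illustrative, not the paper's actual algorithm.

```python
import numpy as np

def projected_corrected_step(x, grad_fn, P, c_global, c_local, lr=0.1):
    """One heterogeneity-corrected local step carried out in a subspace.

    x        : current full-dimensional model, shape (d,)
    grad_fn  : returns a stochastic gradient at x, shape (d,)
    P        : (r, d) projection with orthonormal rows defining the subspace
    c_global : server control variate, kept as r projected numbers
    c_local  : this client's control variate, also r-dimensional
    """
    g = grad_fn(x)                       # full-dimensional stochastic gradient
    g_sub = P @ g                        # project: only r coordinates are exchanged
    g_corr = g_sub - c_local + c_global  # SCAFFOLD-style drift correction, in the subspace
    return x - lr * (P.T @ g_corr)       # lift the corrected step back to R^d

# Tiny demo on a quadratic with a random 16-dim subspace of R^1000.
rng = np.random.default_rng(0)
d, r = 1000, 16
Q, _ = np.linalg.qr(rng.standard_normal((d, r)))   # Q: (d, r), orthonormal columns
P = Q.T                                            # rows span the active subspace
x = np.zeros(d)
x = projected_corrected_step(x, lambda w: w - 1.0, P, np.zeros(r), np.zeros(r))
```

The point of the sketch is the communication pattern: the control variates and the corrected step live entirely in the r-dimensional subspace, so per-round state scales with r rather than the full model dimension d.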

Abstract

Federated learning increasingly operates in a large-model regime where communication, memory, and computation are all scarce. Typically, non-IID client data induce drift that degrades the stability and performance of local training. Existing remedies such as SCAFFOLD introduce heterogeneity-correction mechanisms to address this challenge, but they incur substantial extra communication and memory overhead. This paper proposes a subspace optimization method for federated learning (SSF), which performs heterogeneity-corrected optimization in a low-dimensional subspace using only projected quantities, while preserving full-dimensional control information through a backfill-style update that retains residual components whenever the active subspace changes. Under standard smoothness and bounded-variance assumptions, SSF attains a non-asymptotic rate of order Õ(1/T + 1/√(NKT)). Experiments show favorable accuracy–efficiency trade-offs under heterogeneous data.
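
The backfill-style update is likewise not spelled out in this summary. One plausible reading, sketched below under the assumption of orthonormal projections: reconstruct the full-dimensional control estimate from the old coordinates plus a stored residual, re-express it in the new subspace, and keep whatever the new subspace cannot represent as the updated residual rather than discarding it. The names (refresh_subspace, residual) are illustrative, not the paper's.

```python
import numpy as np

def refresh_subspace(c_sub, residual, P_old, P_new):
    """Backfill-style handoff when the active subspace changes.

    c_sub    : control coordinates in the old subspace, shape (r,)
    residual : full-dimensional remainder carried between switches, shape (d,)
    P_old    : (r, d) old projection, orthonormal rows
    P_new    : (r, d) new projection, orthonormal rows
    """
    c_full = P_old.T @ c_sub + residual          # best full-dimensional estimate
    c_sub_new = P_new @ c_full                   # coordinates in the new subspace
    residual_new = c_full - P_new.T @ c_sub_new  # part the new subspace would drop
    return c_sub_new, residual_new

# Demo: the handoff is lossless by construction.
rng = np.random.default_rng(1)
d, r = 1000, 16
P_old = np.linalg.qr(rng.standard_normal((d, r)))[0].T
P_new = np.linalg.qr(rng.standard_normal((d, r)))[0].T
c_sub, residual = rng.standard_normal(r), rng.standard_normal(d)
c_new, res_new = refresh_subspace(c_sub, residual, P_old, P_new)
assert np.allclose(P_new.T @ c_new + res_new, P_old.T @ c_sub + residual)
```

By construction, P_new.T @ c_sub_new + residual_new reconstructs the full-dimensional estimate exactly, so no control information is silently lost across subspace switches. In the stated rate, T, K, and N presumably denote communication rounds, local steps per round, and participating clients, as is standard in federated analyses, though the summary does not define them.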