A Theoretical Framework for Energy-Aware Gradient Pruning in Federated Learning

arXiv cs.LG / 3/25/2026


Key Points

  • The paper addresses federated learning’s communication and energy constraints by noting that conventional Top-K magnitude pruning reduces payload but is energy-agnostic in practice.
  • It reformulates pruning as an energy-constrained projection problem that incorporates hardware-level differences between memory- and compute-related costs after backpropagation.
  • The proposed Cost-Weighted Magnitude Pruning (CWMP) selects updates by balancing update magnitude against their physical cost, rather than magnitude alone.
  • The authors show CWMP is an optimal greedy solution to the constrained projection and provide probabilistic analysis of its global energy efficiency.
  • Experiments on non-IID CIFAR-10 indicate CWMP achieves a better performance–energy tradeoff (Pareto frontier) than the Top-K baseline.
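The summary above describes CWMP only at a high level; the paper's exact formulation is not reproduced here. A minimal sketch of the greedy rule as described, scoring each update by magnitude per unit of physical cost and keeping the highest-scoring updates until an assumed energy budget is exhausted, might look like this (the function name, cost vector, and budget parameter are illustrative, not from the paper):

```python
import numpy as np

def cwmp_select(grad, cost, budget):
    """Greedy cost-weighted magnitude pruning (illustrative sketch).

    grad:   flat array of parameter updates
    cost:   assumed per-update physical (energy) cost, same shape as grad
    budget: total energy budget for the selected updates

    Returns a boolean mask of the updates to keep.
    """
    # Score each update by magnitude per unit cost (the "cost-weighted" rule).
    score = np.abs(grad) / cost
    # Consider updates from highest to lowest score.
    order = np.argsort(-score)
    mask = np.zeros(grad.size, dtype=bool)
    spent = 0.0
    for i in order:
        # Greedily keep an update only if it still fits in the budget.
        if spent + cost[i] <= budget:
            mask[i] = True
            spent += cost[i]
    return mask
```

Note how a large update with a high transmission or memory cost can be skipped in favor of smaller but cheaper updates, which is exactly where this rule diverges from plain Top-K magnitude pruning.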

Abstract

Federated Learning (FL) is constrained by the communication and energy limitations of decentralized edge devices. While gradient sparsification via Top-K magnitude pruning effectively reduces the communication payload, it remains inherently energy-agnostic: it assumes all parameter updates incur identical downstream transmission and memory-update costs, ignoring hardware realities. We formalize the pruning process as an energy-constrained projection problem that accounts for the hardware-level disparities between memory-intensive and compute-efficient operations during the post-backpropagation phase. We propose Cost-Weighted Magnitude Pruning (CWMP), a selection rule that prioritizes parameter updates based on their magnitude relative to their physical cost. We demonstrate that CWMP is the optimal greedy solution to this constrained projection and provide a probabilistic analysis of its global energy efficiency. Numerical results on a non-IID CIFAR-10 benchmark show that CWMP consistently establishes a superior performance-energy Pareto frontier compared to the Top-K baseline.