Optimal Routing for Federated Learning over Dynamic Satellite Networks: Tractable or Not?

arXiv cs.LG / 4/22/2026


Key Points

  • The paper studies federated learning (FL) over dynamic, relay-based satellite networks, where each FL round requires both distributing the global model and collecting client updates via routing decisions.
  • It performs a rigorous tractability analysis of both phases. For global model distribution, the settings vary the number of models, the objective function, the routing mode (unicast vs. multicast), and whether flows are splittable.
  • For local model collection, the settings vary the number of models, whether clients are selected, and whether flows are splittable, and the analysis shows how each choice affects computational complexity.
  • The authors prove, case by case, whether the globally optimal routing can be found in polynomial time or whether the problem becomes NP-hard, thereby mapping clear “tractable vs. intractable” regimes.
  • In tractable regimes, the derived efficient algorithms are directly applicable to in-orbit FL; in intractable regimes, the NP-hardness results explain why some routing designs cannot be expected to be optimized exactly in polynomial time.
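
To make the splittable vs. unsplittable distinction concrete, the sketch below compares the two on a toy relay network. It is a generic illustration, not an algorithm from the paper: the node names, the capacities, and the helpers `max_flow` and `widest_path` are all assumptions made for this example. A splittable flow may be divided across several paths, so its best throughput is a maximum flow; an unsplittable flow must follow one path, so its best throughput is the largest bottleneck over any single path.

```python
import heapq
from collections import deque


def max_flow(cap, s, t):
    """Edmonds-Karp maximum flow: best throughput when the flow may be split.

    cap maps a directed link (u, v) to its capacity (hypothetical units).
    """
    residual = dict(cap)
    for (u, v) in cap:
        residual.setdefault((v, u), 0)  # reverse edges for the residual graph
    adj = {}
    for (u, v) in residual:
        adj.setdefault(u, []).append(v)

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, []):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total

        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        total += bottleneck


def widest_path(cap, s, t):
    """Largest bottleneck capacity over any single s-t path (unsplittable flow)."""
    adj = {}
    for (u, v), c in cap.items():
        adj.setdefault(u, []).append((v, c))
    best = {s: float("inf")}
    heap = [(-best[s], s)]  # max-heap on path width via negated keys
    while heap:
        neg_width, u = heapq.heappop(heap)
        width = -neg_width
        if u == t:
            return width
        if width < best.get(u, 0):
            continue  # stale heap entry
        for v, c in adj.get(u, []):
            new_width = min(width, c)
            if new_width > best.get(v, 0):
                best[v] = new_width
                heapq.heappush(heap, (-new_width, v))
    return 0


# Toy network (assumed): one client, two relay satellites, one server.
links = {
    ("client", "relay1"): 1, ("relay1", "server"): 1,
    ("client", "relay2"): 1, ("relay2", "server"): 1,
}
splittable = max_flow(links, "client", "server")
unsplittable = widest_path(links, "client", "server")
print(splittable, unsplittable)
```

On this toy network the splittable throughput is 2 (one unit through each relay) while the unsplittable throughput is 1. Both computations here are polynomial for a single flow; it is well known that routing several unsplittable flows under shared capacities is NP-hard in general, which is the kind of boundary the paper's case analysis sets out to locate for in-orbit FL.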

Abstract

Federated learning (FL) is a key paradigm for distributed model learning across decentralized data sources. Communication in each FL round typically consists of two phases: (i) distributing the global model from a server to clients, and (ii) collecting updated local models from clients to the server for aggregation. This paper focuses on a type of FL where communication between a client and the server is relay-based over dynamic networks, making routing optimization essential. A typical scenario is in-orbit FL, where satellites act as clients and communicate with a server (which can be a satellite, ground station, or aerial platform) via multi-hop inter-satellite links. This paper presents a comprehensive tractability analysis of routing optimization for in-orbit FL under different settings. For global model distribution, these include the number of models, the objective function, and routing schemes (unicast versus multicast, and splittable versus unsplittable flow). For local model collection, the settings consider the number of models, client selection, and flow splittability. For each case, we rigorously prove whether the global optimum is obtainable in polynomial time or the problem is NP-hard. Together, our analysis draws clear boundaries between tractable and intractable regimes for a broad spectrum of routing problems for in-orbit FL. For tractable cases, the derived efficient algorithms are directly applicable in practice. For intractable cases, we provide fundamental insights into their inherent complexity. These contributions fill a critical yet unexplored research gap, laying a foundation for principled routing design, evaluation, and deployment in satellite-based FL or similar distributed learning systems.
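
As a rough picture of what relay-based routing over a dynamic network involves, the sketch below computes the earliest time a global model sent by the server can reach one satellite when links exist only during scheduled contact windows. This is a generic textbook-style illustration under assumed conventions, not the paper's model or algorithm: the contact tuple format `(sender, receiver, start, end, duration)`, the function name `earliest_arrival`, and the example contact plan are all hypothetical. Because waiting at a node is allowed and arriving earlier never hurts, a Dijkstra-style search on arrival times suffices for this simple single-model unicast setting.

```python
import heapq


def earliest_arrival(contacts, source, target, t0=0.0):
    """Earliest time a message leaving `source` at t0 can reach `target`.

    Each contact is (sender, receiver, start, end, duration): a transmission
    may begin at or after `start` and must finish by `end`, taking `duration`.
    Returns None if the target cannot be reached.
    """
    outgoing = {}
    for u, v, start, end, duration in contacts:
        outgoing.setdefault(u, []).append((v, start, end, duration))

    arrival = {source: t0}
    heap = [(t0, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if u == target:
            return t
        if t > arrival.get(u, float("inf")):
            continue  # stale heap entry
        for v, start, end, duration in outgoing.get(u, []):
            depart = max(t, start)       # wait at u until the window opens
            arrive = depart + duration
            if arrive <= end and arrive < arrival.get(v, float("inf")):
                arrival[v] = arrive
                heapq.heappush(heap, (arrive, v))
    return None


# Hypothetical contact plan: a two-hop relay beats the later direct contact.
plan = [
    ("server", "satA", 0, 10, 2),
    ("satA", "satB", 5, 8, 2),
    ("server", "satB", 12, 20, 3),
]
print(earliest_arrival(plan, "server", "satB"))
```

In this plan the relayed route (server to satA, then satA to satB) delivers at time 7, earlier than the direct contact, which delivers at time 15. The settings analyzed in the paper add further dimensions, such as multiple models, multicast, client selection, and flow splittability, which this sketch does not capture.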