Aggregation Alignment for Federated Learning with Mixture-of-Experts under Data Heterogeneity
arXiv cs.LG / 2026-03-24
Key Points
- The paper studies how to fine-tune mixture-of-experts (MoE) LLMs with federated learning (FL) under non-IID, privacy-sensitive data, where client heterogeneity breaks standard parameter aggregation.
- It identifies two key FL-specific aggregation failures: inconsistent gating preferences that yield a “one-size-fits-none” global router, and mismatched semantic roles of same-index experts that blur specialization.
- To fix this, it proposes FedAlign-MoE, which aligns routing behavior across clients using routing/distribution consistency weighting and regularization for more stable global gating.
- It also introduces semantic consistency measurement for same-index experts and selectively aggregates updates only from semantically aligned clients to preserve expert specialization.
- Experiments reportedly show that FedAlign-MoE converges faster and reaches higher accuracy than prior methods in non-IID federated settings.
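The two aggregation fixes above can be sketched in code. The following is a minimal illustration, not the paper's actual algorithm: function name, the inverse-KL consistency weighting, the cosine-similarity alignment test, and the threshold `tau` are all assumptions chosen to make the two ideas concrete (weight router updates by how consistent each client's gating distribution is with the group, and aggregate a same-index expert only from clients whose updates point in a similar direction).

```python
import numpy as np

def aggregate_fedalign_sketch(gate_dists, router_updates, expert_updates, tau=0.5):
    """Hypothetical sketch of alignment-aware aggregation (not the paper's exact method).

    gate_dists:     (n_clients, n_experts) per-client average gating distributions
    router_updates: (n_clients, router_dim) per-client router parameter updates
    expert_updates: (n_clients, n_experts, dim) per-client expert parameter updates
    """
    gate_dists = np.asarray(gate_dists, dtype=float)
    router_updates = np.asarray(router_updates, dtype=float)
    expert_updates = np.asarray(expert_updates, dtype=float)

    # Routing consistency: clients whose gating distribution is closer (in KL)
    # to the group mean get larger weight in the global router update.
    mean_dist = gate_dists.mean(axis=0)
    kl = (gate_dists * np.log(gate_dists / mean_dist)).sum(axis=1)
    w = np.exp(-kl)
    w /= w.sum()
    router = (w[:, None] * router_updates).sum(axis=0)

    # Semantic consistency: for each expert index, keep only clients whose
    # update is cosine-aligned with the mean update direction for that index.
    experts = []
    for e in range(expert_updates.shape[1]):
        upd = expert_updates[:, e]
        ref = upd.mean(axis=0)
        cos = upd @ ref / (np.linalg.norm(upd, axis=1) * np.linalg.norm(ref) + 1e-12)
        mask = cos >= tau
        experts.append(upd[mask].mean(axis=0) if mask.any() else ref)
    return router, np.stack(experts)
```

A client whose expert-`e` update points in the opposite direction from the rest (a sign the same slot learned a different semantic role locally) is excluded from that expert's average, so it cannot blur the global expert's specialization.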