Who Trains Matters: Federated Learning under Enrollment and Participation Selection Biases
arXiv cs.LG / April 30, 2026
Key Points
- The paper shows that federated learning can suffer from two distinct selection biases: enrollment bias (who is ever eligible or reachable) and participation bias (who actually participates in each round). Either can break the representativeness assumption underlying FL training.
- It formalizes federated learning under a two-stage client selection model and introduces FedIPW, an inverse-probability-weighted aggregation method to recover target-population mean updates under standard ignorability/positivity assumptions.
- Since covariates for non-enrolled clients are often missing, it also proposes a limited-information aggregate-calibration extension that reweights enrolled clients using known target-population summaries to partially correct enrollment bias.
- The authors analyze algorithm-agnostic optimization effects and find that incomplete selection correction can leave a persistent (non-vanishing) bias floor.
- Experiments with synthetic federated logistic regression confirm the objective mismatch predicted by theory and demonstrate that enrollment correction reduces target-population error under two-stage selection.
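The paper's exact estimator is not reproduced in this summary, but the inverse-probability-weighting idea behind FedIPW can be sketched as follows. This is a minimal illustration, assuming each participating client's round-level participation probability `p_k` is known and strictly positive (the positivity assumption mentioned above); `ipw_aggregate` and its arguments are hypothetical names, not the authors' API.

```python
import numpy as np

def ipw_aggregate(updates, participation_probs, client_weights=None):
    """Sketch of an inverse-probability-weighted (FedIPW-style) mean update.

    Assumes each participating client k has a known, positive probability
    p_k of appearing in the round; its update is reweighted by 1/p_k so
    that rarely-participating clients are not underrepresented in the
    aggregate, recovering (in expectation) the target-population mean.
    """
    updates = np.asarray(updates, dtype=float)
    p = np.asarray(participation_probs, dtype=float)
    if client_weights is None:
        client_weights = np.ones(len(updates))  # e.g. local sample sizes
    w = np.asarray(client_weights, dtype=float) / p  # inverse-probability weights
    # Self-normalized weighted mean over the participating clients
    return np.average(updates, axis=0, weights=w)
```

For example, a client that participates in only half of its eligible rounds (`p_k = 0.5`) receives twice the weight of an always-participating client, compensating for its underrepresentation across rounds.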