Sample Selection Using Multi-Task Autoencoders in Federated Learning with Non-IID Data
arXiv cs.LG · April 30, 2026
Key Points
- The paper proposes federated learning sample-selection techniques to reduce the impact of redundant, malicious, abnormal, and noisy training samples that can degrade accuracy and efficiency.
- It introduces a multi-task autoencoder framework that estimates each image sample’s contribution using loss and feature analysis, coupled with unsupervised outlier detection methods.
- Clients apply central-server-managed filtering based on a one-class SVM (OCSVM), an isolation forest (IF), and an adaptive loss threshold (AT); feature-based selection is further improved via a centrally controlled multi-class deep SVDD loss.
- Experiments on CIFAR-10 and MNIST under varying client counts, non-IID data distributions, and noise levels up to 40% show accuracy gains of up to 7.02% (CIFAR-10) and 1.83% (MNIST) from loss-based selection, plus up to a further 0.99% on CIFAR-10 from the federated SVDD loss.
- Overall, the results indicate the proposed methods are effective and robust across different federated settings and noise conditions.
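To make the selection idea concrete, here is a minimal sketch (not the paper's code) of client-side sample filtering: unsupervised outlier detectors are fit to per-sample autoencoder losses, mirroring the OCSVM, isolation-forest, and adaptive-threshold selectors named above. The loss values, contamination rate, and threshold multiplier `k` are illustrative assumptions.

```python
# Hedged sketch: select training samples on a federated client by treating
# per-sample autoencoder loss as an outlier score. All numbers are simulated.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Simulated per-sample losses: 90 clean samples (low loss), 10 noisy (high loss).
clean = rng.normal(loc=0.1, scale=0.02, size=90)
noisy = rng.normal(loc=0.8, scale=0.1, size=10)
losses = np.concatenate([clean, noisy]).reshape(-1, 1)

# Isolation forest (IF): flags samples in low-density loss regions as outliers.
if_mask = IsolationForest(contamination=0.1, random_state=0).fit_predict(losses) == 1

# One-class SVM (OCSVM): learns a boundary around the bulk of the distribution.
oc_mask = OneClassSVM(nu=0.1).fit_predict(losses) == 1

# Adaptive threshold (AT): keep samples with loss below mean + k * std.
k = 2.0
at_mask = losses.ravel() < losses.mean() + k * losses.std()

# A client would then train only on the samples each selector keeps.
print("kept:", if_mask.sum(), oc_mask.sum(), at_mask.sum())
```

In the paper's setting the selector parameters are managed by the central server rather than hard-coded per client; the sketch only shows the local filtering step.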