Improving reproducibility by controlling random seed stability in machine learning based estimation via bagging
arXiv stat.ML / 4/21/2026
Key Points
- The paper studies how variability across different random seeds in machine learning can destabilize downstream debiased machine learning estimators.
- It defines “random seed stability” using a concentration condition and proves that subbagging can ensure stability for any bounded-outcome regression algorithm.
- The authors propose a new debiased machine learning workflow called “adaptive cross-bagging,” designed to remove seed dependence from both nuisance estimation and sample splitting through a modified cross-fitting procedure.
- Numerical experiments show the proposed method meets the desired stability target, while baseline alternatives either fail to achieve the same level of stability or require much higher computational costs.
- Compared with standard practice, the approach adds only a small computational overhead, whereas competing methods can incur large penalties.
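The core stabilization idea can be illustrated with a minimal numpy sketch of subbagging (this is a generic illustration, not the paper's adaptive cross-bagging workflow; the function name, OLS base learner, and all parameters are hypothetical). The seed only controls which subsamples are drawn, so averaging a base regressor over many random subsamples damps seed-to-seed variability in the fitted predictions:

```python
import numpy as np

def subbag_predict(X, y, X_new, n_bags=50, frac=0.5, seed=0):
    """Subbagging sketch: average an OLS base learner fit on n_bags
    random subsamples (each a fraction `frac` of the data, drawn
    without replacement). The seed governs only the subsample draws."""
    rng = np.random.default_rng(seed)
    n = len(y)
    m = int(frac * n)
    preds = np.zeros(len(X_new))
    for _ in range(n_bags):
        idx = rng.choice(n, size=m, replace=False)
        # base learner: ordinary least squares on the subsample
        coef, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        preds += X_new @ coef
    return preds / n_bags
```

Re-running with different seeds, the spread of a single-subsample fit across seeds is much larger than the spread of the subbagged average, which is the concentration behavior the paper formalizes as random seed stability.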