Bootstrapping with AI/ML-generated labels

arXiv stat.ML / 4/28/2026


Key Points

  • The paper analyzes how AI/ML-generated binary labels, when used as regression covariates, can induce large biases in OLS estimates and invalidate standard inference, even when misclassification errors are small.
  • It shows that a seemingly natural “fixed-label” bootstrap, which generates bootstrap data from the estimated labels but relies on a corrupted version of them in estimation, is generally invalid unless the latent true labels are independent of the other covariates, which is a strong condition.
  • The authors introduce a “coupled-label bootstrap” that jointly resamples the true labels and the imputed labels, proving it yields valid inference without requiring that strong independence condition.
  • They further propose two finite-sample adjustments that improve coverage: a variance correction for uncertainty in the estimated misclassification rates and a Hessian rotation for near-singular designs.
  • The methods are validated via simulations and demonstrated on an economics application examining the relationship between wages and remote work status.

Abstract

AI/ML methods are increasingly used in economics to generate binary variables (or labels) via classification algorithms. When these generated variables are included as covariates in regressions, even small misclassification errors can induce large biases in OLS estimators and invalidate standard inference. We study whether the bootstrap can correct this bias and deliver valid inference. We first show that a seemingly natural fixed-label bootstrap, which generates data using estimated labels but relies on a corrupted version in estimation, is generally invalid unless a strong independence condition between the latent true labels and other covariates holds. We then propose a coupled-label bootstrap that jointly resamples the true and imputed labels, and show it is valid without this condition. Two finite-sample adjustments further improve coverage: a variance correction for uncertainty in estimated misclassification rates and a Hessian rotation for near-singular designs. We illustrate the methods in simulations and apply them to investigate the relationship between wages and remote work status.
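
The claim that small misclassification errors can produce large biases can be illustrated with a short simulation. The sketch below is not taken from the paper: the data-generating process, the function names (`attenuation_factor`, `simulate_slopes`), and all parameter values are assumptions chosen for illustration. It uses a single binary regressor with classification errors that are independent of the outcome noise, in which case the OLS slope on the noisy label equals the true coefficient times P(D=1 | D̂=1) − P(D=1 | D̂=0). With a 10% share of true positives and 5% false-positive and false-negative rates, that factor is about 0.67, so roughly a third of the effect is lost.

```python
import numpy as np


def attenuation_factor(p, fp, fn):
    """Population ratio of the slope on the noisy label to the true slope.

    p  : share of observations whose true label is 1 (assumed)
    fp : P(predicted 1 | true 0), fn : P(predicted 0 | true 1) (assumed)
    """
    p_true_given_pred1 = p * (1 - fn) / (p * (1 - fn) + (1 - p) * fp)
    p_true_given_pred0 = p * fn / (p * fn + (1 - p) * (1 - fp))
    return p_true_given_pred1 - p_true_given_pred0


def simulate_slopes(n=200_000, beta=1.0, p=0.1, fp=0.05, fn=0.05, seed=0):
    """Return OLS slopes using the true label and the misclassified label."""
    rng = np.random.default_rng(seed)
    d_true = rng.random(n) < p
    # Flip each label with a probability that depends only on its true value.
    flip = np.where(d_true, rng.random(n) < fn, rng.random(n) < fp)
    d_hat = d_true ^ flip
    # Illustrative outcome equation: y = 2 + beta * D + noise.
    y = 2.0 + beta * d_true + rng.normal(size=n)

    def ols_slope(d):
        X = np.column_stack([np.ones(n), d.astype(float)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        return coef[1]

    return ols_slope(d_true), ols_slope(d_hat)


if __name__ == "__main__":
    slope_true, slope_noisy = simulate_slopes()
    print("slope with true labels: ", round(slope_true, 3))
    print("slope with noisy labels:", round(slope_noisy, 3))
    print("predicted attenuation:  ", round(attenuation_factor(0.1, 0.05, 0.05), 3))
```

The bias is large here because the true label is rare: even a 5% false-positive rate means a substantial share of predicted positives are in fact negatives. This sketch covers only the bias; it does not implement the fixed-label or coupled-label bootstrap described in the abstract.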