Refining Covariance Matrix Estimation in Stochastic Gradient Descent Through Bias Reduction

arXiv stat.ML / April 24, 2026


Key Points

  • The paper investigates online inference and asymptotic covariance estimation for stochastic gradient descent (SGD), focusing on the limitations of existing estimators such as plug-in and batch-means methods.
  • Classical approaches either rely on second-order information (e.g., the Hessian) that is often inaccessible or converge too slowly to be practically useful; a simplified batch-means sketch follows this list.
  • The authors introduce a fully online, bias-reduced (de-biased) covariance estimator that avoids any need for second-order derivatives.
  • The proposed method achieves an improved convergence rate of n^((α-1)/2) · √(log n), and is reported to outperform existing Hessian-free alternatives.
  • Overall, the work provides a more accurate and more efficient route to covariance estimation in SGD without requiring second-order computation.
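
To ground the comparison, here is a minimal sketch of the classical batch-means estimator that the key points refer to as a baseline. This is not the paper's de-biased method; the setup (streaming linear regression with Gaussian data, step sizes η_t = η₀·t^(−α), the b ≈ √n batch-size heuristic) and all constants are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of the classical (non-overlapping) batch-means covariance
# estimator -- NOT the paper's de-biased method. Assumed setup: streaming
# linear regression with Gaussian data; all constants are arbitrary.

rng = np.random.default_rng(0)
d, n = 5, 100_000                       # dimension, number of SGD steps
theta_star = rng.normal(size=d)         # ground-truth parameter

alpha, eta0 = 0.505, 0.1                # step sizes eta_t = eta0 * t^(-alpha)
theta = np.zeros(d)
iterates = np.empty((n, d))

for t in range(1, n + 1):
    x = rng.normal(size=d)              # stream one data point (x_t, y_t)
    y = x @ theta_star + rng.normal()
    grad = (x @ theta - y) * x          # stochastic gradient of squared loss
    theta = theta - eta0 * t ** (-alpha) * grad
    iterates[t - 1] = theta

theta_bar = iterates.mean(axis=0)       # Polyak-Ruppert averaged iterate

# Batch means: split the trajectory into K batches of size b. Each batch
# mean has covariance roughly Sigma / b, so rescaling their sample scatter
# by b estimates the asymptotic covariance Sigma of
# sqrt(n) * (theta_bar - theta_star). The early-iterate transient is
# ignored here, one source of the slow convergence the paper targets.
b = int(n ** 0.5)                       # common batch-size heuristic
K = n // b
batch_means = iterates[: K * b].reshape(K, b, d).mean(axis=1)
dev = batch_means - theta_bar
sigma_hat = b / (K - 1) * dev.T @ dev

# For this model A = S = I, so Sigma = A^{-1} S A^{-1} = I and sigma_hat
# should come out roughly equal to the 5x5 identity matrix.
print(np.round(sigma_hat, 2))
```

Note that this baseline never touches the Hessian, but its accuracy hinges on the batch-size choice and on how the transient phase is handled; the paper's de-biased estimator is positioned as a fully online improvement over exactly this class.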

Abstract

We study online inference and asymptotic covariance estimation for the stochastic gradient descent (SGD) algorithm. While classical methods (such as plug-in and batch-means estimators) are available, they either require inaccessible second-order (Hessian) information or suffer from slow convergence. To address these challenges, we propose a novel, fully online de-biased covariance estimator that eliminates the need for second-order derivatives while significantly improving estimation accuracy. Our method employs a bias-reduction technique to achieve a convergence rate of $n^{(\alpha-1)/2} \sqrt{\log n}$, outperforming existing Hessian-free alternatives.
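
One reading note: the abstract uses α without defining it. In this literature α is usually the decay exponent of the step-size schedule, so, as an assumption rather than something the abstract states, the rate can be spelled out as:

```latex
% Assumption (not stated in the abstract): step sizes \eta_t = \eta_0 t^{-\alpha}
% with \alpha \in (1/2, 1), the standard Robbins-Monro schedule.
\[
  \bigl\lVert \widehat{\Sigma}_n - \Sigma \bigr\rVert
  = O\!\bigl( n^{(\alpha - 1)/2} \sqrt{\log n} \bigr),
\]
% which tends to zero as n grows, since (\alpha - 1)/2 < 0 whenever \alpha < 1.
```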