Cross-Fitting-Free Debiased Machine Learning with Multiway Dependence

arXiv stat.ML / 4/7/2026


Key Points

  • The paper develops an asymptotic theory for two-step debiased machine learning (DML) estimators in generalised method of moments (GMM) models with general multiway clustered dependence, without relying on cross-fitting.
  • It argues that cross-fitting can be statistically inefficient and computationally costly when first-stage learners are complex and when the effective sample size is limited by the number of independent clusters.
  • The authors achieve valid inference without sample splitting by using Neyman-orthogonal moment conditions together with a localisation-based empirical process method that supports an arbitrary number of clustering dimensions.
  • The resulting debiased GMM estimators are proven to be asymptotically linear and asymptotically normal under multiway clustered dependence.
  • A key contribution is the development of new global and local maximal inequalities for function classes defined on sums of separately exchangeable arrays, which may be useful beyond the immediate DML application.
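
For readers unfamiliar with the term, Neyman orthogonality can be written in its standard textbook form as below. The notation is an assumption made for illustration and is not taken from the paper: W is the data, ψ the moment (score) function, θ₀ the target parameter, and η₀ the true nuisance function estimated in the first step.

```latex
% Moment condition identifying the target parameter \theta_0
\mathbb{E}\big[\psi(W;\theta_0,\eta_0)\big] = 0,
% Neyman orthogonality: the moment is first-order insensitive to
% perturbations of the nuisance \eta around its true value \eta_0
\qquad
\frac{\partial}{\partial r}\,
\mathbb{E}\big[\psi\big(W;\theta_0,\eta_0 + r(\eta-\eta_0)\big)\big]\Big|_{r=0} = 0
\quad \text{for all admissible } \eta .
```

The second condition is what makes first-stage estimation error enter the moment only at second order, which is the property the key points rely on.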

Abstract

This paper develops an asymptotic theory for two-step debiased machine learning (DML) estimators in generalised method of moments (GMM) models with general multiway clustered dependence, without relying on cross-fitting. While cross-fitting is commonly employed, it can be statistically inefficient and computationally burdensome when first-stage learners are complex and the effective sample size is governed by the number of independent clusters. We show that valid inference can be achieved without sample splitting by combining Neyman-orthogonal moment conditions with a localisation-based empirical process approach, allowing for an arbitrary number of clustering dimensions. The resulting debiased GMM estimators are shown to be asymptotically linear and asymptotically normal under multiway clustered dependence. A central technical contribution of the paper is the derivation of novel global and local maximal inequalities for general classes of functions of sums of separately exchangeable arrays, which underpin our theoretical arguments and are of independent interest.
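
To make the setting concrete, here is a minimal sketch of a two-step debiased estimator computed on the full sample, with no cross-fitting, under two-way clustered dependence. Everything in it is an assumption chosen for illustration and not the paper's procedure: the partially linear model, the simulated design, the ridge first stage on polynomial features, the function names, and the Cameron-Gelbach-Miller style two-way variance formula.

```python
import numpy as np

def simulate(n_i=40, n_j=40, theta=1.0, seed=0):
    """One observation per (i, j) cell; row and column shocks create two-way dependence."""
    rng = np.random.default_rng(seed)
    i_idx, j_idx = [v.ravel() for v in
                    np.meshgrid(np.arange(n_i), np.arange(n_j), indexing="ij")]
    n = n_i * n_j
    x = rng.normal(size=(n, 3))
    shock_d = 0.5 * rng.normal(size=n_i)[i_idx] + 0.5 * rng.normal(size=n_j)[j_idx]
    shock_y = 0.5 * rng.normal(size=n_i)[i_idx] + 0.5 * rng.normal(size=n_j)[j_idx]
    d = 0.5 * x[:, 0] + shock_d + rng.normal(size=n)
    y = theta * d + np.sin(x[:, 0]) + 0.5 * x[:, 1] + shock_y + rng.normal(size=n)
    return y, d, x, i_idx, j_idx

def ridge_fitted(x, target, penalty=1.0):
    """In-sample fitted values from ridge regression on cubic polynomial features."""
    feats = np.column_stack([np.ones(len(x)), x, x**2, x**3])
    gram = feats.T @ feats + penalty * np.eye(feats.shape[1])
    coef = np.linalg.solve(gram, feats.T @ target)
    return feats @ coef

def cluster_meat(psi, ids):
    """Sum over clusters of the squared within-cluster score totals."""
    sums = np.bincount(ids, weights=psi)
    return float(np.sum(sums**2))

def dml_no_crossfit(y, d, x, i_idx, j_idx):
    # Step 1: fit both nuisance regressions on the FULL sample (no sample splitting).
    y_res = y - ridge_fitted(x, y)
    d_res = d - ridge_fitted(x, d)
    # Step 2: solve the orthogonal (partialling-out) moment for theta.
    theta_hat = float(d_res @ y_res / (d_res @ d_res))
    psi = (y_res - theta_hat * d_res) * d_res
    # Two-way clustered variance: rows + columns - intersection (single cells here).
    meat = cluster_meat(psi, i_idx) + cluster_meat(psi, j_idx) - float(psi @ psi)
    se = float(np.sqrt(max(meat, 0.0)) / (d_res @ d_res))
    return theta_hat, se

y, d, x, i_idx, j_idx = simulate()
theta_hat, se = dml_no_crossfit(y, d, x, i_idx, j_idx)
print(f"theta_hat={theta_hat:.3f}, two-way clustered se={se:.3f}")
```

The point of the sketch is only the workflow: both nuisance fits and the final moment use the same observations, and the standard error accounts for dependence along both clustering dimensions. With a low-dimensional ridge first stage this is unproblematic; the paper's contribution concerns when the same shortcut remains valid for complex learners.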