Deep-testing: the case of dependence detection

arXiv stat.ML / 4/30/2026

Key Points

  • The paper investigates whether the success of deep learning in classification can be extended to statistical hypothesis testing, that is, to deciding whether a sample was generated under a null model or outside it.
  • It introduces “deep-testing,” a procedure that trains a deep neural network on data simulated under both the null and the alternative hypotheses; the learned classification map then serves as the test statistic.
  • By leveraging the network’s strong discriminative ability, deep-testing aims to construct highly powerful hypothesis tests.
  • As a proof of concept, the authors apply the method to independence testing and report that it achieves the best overall power in a large-scale simulation study against 19 competing methods across diverse dependence structures.
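
For readers who want the setup in symbols, the independence-testing problem and a classifier-based rejection rule can be written as follows. The notation (S_n, F_{XY}, \hat{g}, c_\alpha) is assumed here for illustration and is not taken from the paper, and calibrating the critical value as a null quantile is one standard choice rather than a detail the summary confirms.

```latex
% Independence testing for a sample S_n = {(X_i, Y_i)}_{i=1}^n (notation assumed here)
H_0:\ F_{XY}(x, y) = F_X(x)\,F_Y(y)\ \ \text{for all } (x, y)
\qquad \text{versus} \qquad
H_1:\ F_{XY}(x, y) \neq F_X(x)\,F_Y(y)\ \ \text{for some } (x, y).

% Learned statistic and rejection rule (calibration by a null quantile is an assumption)
T(S_n) = \hat{g}(S_n) \in [0, 1],
\qquad
\text{reject } H_0 \text{ if } T(S_n) > c_\alpha,
\quad \text{where } \Pr_{H_0}\bigl(T(S_n) > c_\alpha\bigr) \le \alpha .
```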

Abstract

Deep learning methods have proved highly effective for classification and image recognition problems. In this paper, we ask whether this success can be transferred to hypothesis testing: if a neural network can distinguish, for example, an image of a handwritten digit from another, can it also distinguish an "image of a sample" (such as a scatter plot) generated under a given statistical model from one generated outside that model? Motivated by this idea, we propose a novel procedure called deep-testing, which approaches the classical inferential problem of hypothesis testing through deep learning. More specifically, the test statistic is a classification map learned by a deep neural network from simulated data satisfying the null and alternative hypotheses, leveraging its strong discriminating power to construct a highly powerful test. As a proof of concept, we apply deep-testing to the problem of independence testing, arguably one of the most important problems in statistics. In a large-scale simulation study, deep-testing achieves the highest overall power against nineteen competing methods across a broad range of complex dependence structures, confirming the viability of the proposed approach.
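
The idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it swaps the deep network for a tiny one-hidden-layer perceptron written in NumPy, uses a single hypothetical alternative (y = x² plus noise), and encodes the "image of a sample" as an 8×8 histogram of ranks. The function names, the architecture, and the Monte Carlo calibration of the critical value are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, dependent):
    """Draw n pairs (x, y). The toy alternative y = x^2 + noise is an assumption."""
    x = rng.uniform(-1, 1, n)
    y = x**2 + 0.1 * rng.normal(size=n) if dependent else rng.uniform(-1, 1, n)
    return x, y

def sample_image(x, y, bins=8):
    """Encode a sample as a coarse 'image': a 2D histogram of normalised ranks."""
    n = len(x)
    u = np.argsort(np.argsort(x)) / n  # ranks rescaled to [0, 1)
    v = np.argsort(np.argsort(y)) / n
    h, _, _ = np.histogram2d(u, v, bins=bins, range=[[0, 1], [0, 1]])
    return (h * bins**2 / n - 1.0).ravel()  # roughly centred at 0 under independence

def build_dataset(m, n):
    """m simulated samples of size n; odd indices are dependent (label 1)."""
    X = np.array([sample_image(*simulate(n, bool(i % 2))) for i in range(m)])
    labels = (np.arange(m) % 2).astype(float)
    return X, labels

def train_mlp(X, labels, hidden=16, lr=0.05, epochs=500):
    """Full-batch gradient descent on mean cross-entropy (stand-in for a deep net)."""
    W1 = rng.normal(scale=0.1, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.1, size=hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        p = 1.0 / (1.0 + np.exp(-(H @ w2 + b2)))
        g = (p - labels) / len(labels)       # gradient w.r.t. the output logits
        gH = np.outer(g, w2) * (1.0 - H**2)  # backpropagate through tanh
        w2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gH)
        b1 -= lr * gH.sum(axis=0)
    return W1, b1, w2, b2

def statistic(params, x, y):
    """Test statistic: the network's estimated probability of 'dependent'."""
    W1, b1, w2, b2 = params
    h = np.tanh(sample_image(x, y) @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))

n = 100
params = train_mlp(*build_dataset(400, n))

# Calibrate the critical value on fresh samples simulated under the null.
null_stats = np.array([statistic(params, *simulate(n, False)) for _ in range(500)])
crit = np.quantile(null_stats, 0.95)

# Estimate power on fresh samples from the toy alternative.
alt_stats = np.array([statistic(params, *simulate(n, True)) for _ in range(200)])
power = np.mean(alt_stats > crit)
size = np.mean(null_stats > crit)  # rejection rate on the calibration draws themselves
print(f"critical value={crit:.3f}  size={size:.3f}  power={power:.3f}")
```

Because the critical value is a simulated null quantile, the rejection rate under independence is controlled regardless of how well the classifier is trained; a better classifier only improves power. Training against one narrow alternative, as done here, would not be expected to generalise to the broad range of dependence structures the paper reports on.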