Analytical Correction for Subsampling Bias in Drifting Models

arXiv cs.LG / 5/1/2026


Key Points

  • The paper shows that, in drifting one-step generative models, using minibatch samples to approximate centroids produces a biased estimator due to softmax self-normalization, with a pointwise bias of order O(1/n) (illustrated numerically in the sketch after this list).
  • Because correcting the bias would require an intractable expectation over the full underlying distributions, the authors introduce Analytical Bias Correction (ABC) as a closed-form plug-in adjustment estimated from in-batch statistics.
  • Theoretical results prove ABC reduces the bias scaling from O(1/n) to O(1/n^2), does not increase total variance at first order, and keeps the corrected centroid within the original convex hull.
  • Experiments (including CIFAR-10) confirm the predicted bias-scaling behavior and show that ABC improves FID and training speed, especially when the minibatch size n is small.

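To make the source of the O(1/n) bias concrete, here is a small toy sketch, not taken from the paper: the self-normalized centroid is a ratio of two in-batch means, and ratio estimators of this kind are biased at order 1/n. The 1-D Gaussian data, the Gaussian-kernel weight, and names such as `weights` and `c_true` are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D setup (illustrative only; the paper's drifting field, kernel, and
# data are not reproduced here).  Weights are a Gaussian kernel around a
# fixed query point x, playing the role of the softmax weights.
x, tau = 0.5, 1.0

def weights(y):
    return np.exp(-((y - x) ** 2) / tau)

# "True" self-normalized centroid E[w(Y) Y] / E[w(Y)], approximated with a
# very large reference sample.
y_ref = rng.normal(size=1_000_000)
c_true = np.sum(weights(y_ref) * y_ref) / np.sum(weights(y_ref))

# Average many minibatch centroids for several batch sizes n.  The minibatch
# estimate is a ratio of in-batch means, so its bias shrinks like O(1/n):
# n * bias stays roughly constant across the rows printed below.
for n in (4, 8, 16, 32, 64):
    batches = rng.normal(size=(100_000, n))
    w = weights(batches)
    c_hat = (w * batches).sum(axis=1) / w.sum(axis=1)  # self-normalized centroid
    bias = c_hat.mean() - c_true
    print(f"n={n:3d}  bias={bias:+.5f}  n*bias={n * bias:+.4f}")
```
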
Abstract

Drifting models are capable one-step generative models trained to follow a drifting field. The field combines attractive and repulsive softmax-weighted centroids over the data and current-generator distributions. In practice, only a minibatch of n samples from each distribution is available, and each centroid is approximated by an empirical estimate. In this paper, we begin by showing that the minibatch centroid is in general a biased estimator of the target centroid, with a pointwise O(1/n) bias arising from softmax self-normalization. Correcting this bias requires the expectation over the full distribution, which is intractable. We instead approximate the leading bias term from in-batch statistics and propose Analytical Bias Correction (ABC), a closed-form plug-in adjustment. We prove that ABC reduces the bias from O(1/n) to O(1/n^2), introduces no first-order increase in total variance, and preserves convex-hull containment of the corrected centroid. In practice, ABC requires only two additional lines of code and has negligible wall-time overhead under compiled execution. Toy experiments confirm the theoretical O(1/n) and O(1/n^2) scaling. On CIFAR-10, ABC reduces FID and trains faster, with the largest gains at small n, where the bias is most significant.
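
The abstract notes that ABC amounts to roughly two additional lines of code. As a hedged illustration of what such a plug-in adjustment could look like, the sketch below subtracts the standard leading-order ratio-estimator bias term, estimated from in-batch (co)variances; the paper's exact closed form, and its convex-hull guarantee, may differ from this textbook expansion, and the toy data and kernel are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# One minibatch of n samples.  y_i are the samples (toy 1-D Gaussian) and
# w_i a Gaussian-kernel weight around a query point x -- illustrative stand-ins
# for the paper's softmax weights, not its actual setup.
n, x, tau = 16, 0.5, 1.0
y = rng.normal(size=n)
w = np.exp(-((y - x) ** 2) / tau)

# Naive self-normalized centroid: a ratio of two in-batch means, biased O(1/n).
wy = w * y
A, B = wy.mean(), w.mean()
c_hat = A / B

# Plug-in correction: subtract the leading-order ratio-estimator bias term,
# estimated from in-batch (co)variances.  This is the textbook expansion; the
# paper's exact ABC formula and its convex-hull guarantee may differ.
cov = np.cov(wy, w, ddof=1)                      # 2x2 covariance of (w*y, w)
bias_hat = (c_hat * cov[1, 1] - cov[0, 1]) / (n * B**2)
c_abc = c_hat - bias_hat

print(f"naive centroid:      {c_hat:+.4f}")
print(f"corrected centroid:  {c_abc:+.4f}")
```

In this form the correction reuses only quantities already computed for the naive centroid, which is consistent with the abstract's claim of negligible wall-time overhead under compiled execution.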