Learning to Query History: Nonstationary Classification via Learned Retrieval

arXiv cs.LG / 4/9/2026


Key Points

  • The paper proposes reframing nonstationary classification as time-series prediction by conditioning decisions on a sequence of historical labeled examples rather than only the current input.
  • It introduces a learned discrete retrieval module, trained end-to-end with the classifier, that selects relevant historical instances using input-dependent queries, enabling scalable retrieval from long histories.
  • The retrieval mechanism is optimized jointly with the classifier using a score-based gradient estimator, avoiding the need to load all history into GPU memory during training and deployment.
  • Experiments on synthetic benchmarks and the Amazon Reviews '23 dataset (electronics category) demonstrate improved robustness to distribution shift versus standard classifiers.
  • The authors report that VRAM usage scales predictably with the length of the retrieved history sequence, supporting practical deployment with large stored corpora.
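To make the first key point concrete, here is a minimal sketch (in numpy, with hypothetical names; the paper's actual architecture is not specified here) of what "conditioning on a sequence of historical labeled examples" can look like at the input level: each retrieved historical (input, label) pair is paired with the current input to form a sequence the classifier consumes, rather than the current input alone.

```python
import numpy as np

def build_context(x, retrieved):
    """Form the classifier's sequence input from the current input x and
    a list of retrieved historical (x_hist, y_hist) pairs.

    Each row pairs the query input with one historical example and its
    label, so the model can condition on labeled history, not just x.
    """
    rows = [np.concatenate([x, x_hist, [float(y_hist)]])
            for x_hist, y_hist in retrieved]
    return np.stack(rows)  # shape: (history_len, 2 * dim + 1)

# Toy usage: a 4-dim input and two retrieved historical examples.
x = np.ones(4)
retrieved = [(np.zeros(4), 1), (np.full(4, 0.5), 0)]
ctx = build_context(x, retrieved)
```

A sequence model (e.g. a transformer) would then map `ctx` to a prediction; the point is only that labeled history enters the forward pass as data, so the mapping from history to decision can track post-training-cutoff drift.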

Abstract

Nonstationarity is ubiquitous in practical classification settings, leading deployed models to perform poorly even when they generalize well to holdout sets available at training time. We address this by reframing nonstationary classification as time series prediction: rather than predicting from the current input alone, we condition the classifier on a sequence of historical labeled examples that extends beyond the training cutoff. To scale to large sequences, we introduce a learned discrete retrieval mechanism that samples relevant historical examples via input-dependent queries, trained end-to-end with the classifier using a score-based gradient estimator. This enables the full corpus of historical data to remain on an arbitrary filesystem during training and deployment. Experiments on synthetic benchmarks and Amazon Reviews '23 (electronics category) show improved robustness to distribution shift compared to standard classifiers, with VRAM scaling predictably as the length of the historical data sequence increases.
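The "score-based gradient estimator" in the abstract refers to the score-function (REINFORCE) trick, which lets gradients reach the retrieval parameters even though the sampled index is discrete: ∇E[L] = E[L · ∇ log p(i)]. Below is a self-contained toy sketch under assumed details (dot-product relevance scores, a single retrieved example, illustrative names like `W_q`); it is not the paper's implementation. Only the sampled item need be materialized, which is why the full corpus can stay on disk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stored history (in the paper this lives on a filesystem;
# only the sampled example would be loaded).
history_x = rng.normal(size=(100, 8))
history_y = rng.integers(0, 2, size=100)

W_q = rng.normal(scale=0.1, size=(8, 8))  # learned query projection

def retrieval_probs(x):
    """Input-dependent query -> softmax relevance over the history."""
    q = x @ W_q                     # query derived from current input
    logits = history_x @ q          # dot-product scores
    p = np.exp(logits - logits.max())
    return p / p.sum()

def score_function_grad(x, loss_fn):
    """One-sample REINFORCE estimate of d E[loss] / d W_q.

    Sampling an index is non-differentiable, so we differentiate
    log p(i) instead: grad = loss * d log p(i) / d W_q.
    """
    p = retrieval_probs(x)
    i = rng.choice(len(p), p=p)
    loss = loss_fn(history_x[i], history_y[i])
    # d log p(i) / d logits = one_hot(i) - p; chain through q = x @ W_q.
    dlogits = -p
    dlogits[i] += 1.0
    dq = history_x.T @ dlogits      # d logits / d q, summed over history
    grad_Wq = loss * np.outer(x, dq)
    return loss, grad_Wq

# Toy loss: 0/1 error of a fixed stand-in classifier on the retrieved pair.
x = rng.normal(size=8)
loss, g = score_function_grad(x, lambda hx, hy: float((hx @ x > 0) != hy))
```

In practice a baseline would be subtracted from the loss to reduce the estimator's variance, and the softmax would be computed over candidate scores rather than the full corpus, but the gradient path through `log p(i)` is the core mechanism the abstract names.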