Few Batches or Little Memory, But Not Both: Simultaneous Space and Adaptivity Constraints in Stochastic Bandits

arXiv cs.LG / 3/17/2026

📰 NewsIdeas & Deep AnalysisModels & Research

共有:

Key Points

It studies stochastic multi-armed bandits under simultaneous space (W bits of memory) and adaptivity (B batches) constraints, showing this regime breaks the mild results known when each constraint is considered separately.
It proves a lower bound: any algorithm with W-bit memory must use at least Ω(K/W) batches to achieve near-minimax regret ~Ō(√(KT)), even when using adaptive grids.
It shows that logarithmic memory rules cannot achieve K-independent batch complexity under these combined constraints.
It introduces an information bottleneck argument showing near-minimax regret requires Ω(K) bits of information about the good-arm set, while the B×W information budget limits achievable information to O(BW).
It provides an algorithm that uses O(log T) bits of memory and ~Ō(K) batches to achieve regret ~Ō(√(KT)), nearly matching the derived lower bound.

Abstract

We study stochastic multi-armed bandits under simultaneous constraints on space and adaptivity: the learner interacts with the environment in

B

batches and has only

W

bits of persistent memory. Prior work shows that each constraint alone is surprisingly mild: near-minimax regret

\widetilde{O}(\sqrt{KT})

is achievable with

O(\log T)

bits of memory under fully adaptive interaction, and with a

K

-independent

O(\log\log T)

-type number of batches when memory is unrestricted. We show that this picture breaks down in the simultaneously constrained regime. We prove that any algorithm with a

W

-bit memory constraint must use at least

\Omega(K/W)

batches to achieve near-minimax regret

\widetilde{O}(\sqrt{KT})

, even under adaptive grids. In particular, logarithmic memory rules out

K

-independent batch complexity. Our proof is based on an information bottleneck. We show that near-minimax regret forces the learner to acquire

\Omega(K)

bits of information about the hidden set of good arms under a suitable hard prior, whereas an algorithm with

B

batches and

W

bits of memory allows only

O(BW)

bits of information. A key ingredient is a localized change-of-measure lemma that yields probability-level arm exploration guarantees, which is of independent interest. We also give an algorithm using

O(\log T)

bits of memory and

\widetilde{O}(K)

batches that achieves regret

\widetilde{O}(\sqrt{KT})

, which nearly matches our lower bound.

What 81,000 people want from AI

Anthropic News

ラピダス、半導体設計AIエージェント「国内2社海外1社が使用中」

日経XTECH

「AIで雇用は増える」「AIの進化はツールがけん引」、5つのAI潮流を解説

日経XTECH

中国AI企業が他社製AIを「ただ乗り蒸留」か米社が主張、安全保障リスクも

日経XTECH

Superposition and the Capsule: Quantum State Collapse Meets AI Identity

Dev.to

Few Batches or Little Memory, But Not Both: Simultaneous Space and Adaptivity Constraints in Stochastic Bandits

Key Points

Abstract

Related Articles

What 81,000 people want from AI

ラピダス、半導体設計AIエージェント「国内2社海外1社が使用中」

「AIで雇用は増える」「AIの進化はツールがけん引」、5つのAI潮流を解説

中国AI企業が他社製AIを「ただ乗り蒸留」か米社が主張、安全保障リスクも

Superposition and the Capsule: Quantum State Collapse Meets AI Identity

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer

Key Points

Abstract

Related Articles

What 81,000 people want from AI

ラピダス、半導体設計AIエージェント「国内2社海外1社が使用中」

「AIで雇用は増える」「AIの進化はツールがけん引」、5つのAI潮流を解説

中国AI企業が他社製AIを「ただ乗り蒸留」か 米社が主張、安全保障リスクも

Superposition and the Capsule: Quantum State Collapse Meets AI Identity

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer

中国AI企業が他社製AIを「ただ乗り蒸留」か米社が主張、安全保障リスクも