Nearly Optimal Best Arm Identification for Semiparametric Bandits

arXiv stat.ML / 4/7/2026


Key Points

The paper studies fixed-confidence best arm identification (BAI) in semiparametric bandits where rewards are linear in arm features plus an unknown additive baseline shift, distinguishing it from standard linear-bandit BAI.
For the transductive case, it proves an attainable instance-dependent lower bound that matches the linear-bandit complexity computed on shifted features.
It introduces a computationally efficient phase-elimination algorithm using a new $XY$-design to enable orthogonalized regression in this semiparametric setting.
The authors derive a nearly optimal high-probability upper bound on sample complexity, with performance matching the lower bound up to logarithmic factors and an additive $d^2$ term.
Experiments on synthetic data and the Jester dataset report clear improvements over prior baselines.
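To make the role of orthogonalized regression concrete, here is a minimal Python sketch. It is not the paper's algorithm or its $XY$-design. It only illustrates, under stated assumptions, why ordinary least squares is biased by a baseline shift and why regressing on features centered by their mean under a known sampling distribution removes that bias. The arm set, `theta`, the uniform sampling distribution, and the sinusoidal shift are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_and_estimate(n_rounds=20000, seed=0):
    """Compare naive and centered least squares under a baseline shift.

    All quantities here (arms, theta, shift) are hypothetical illustrations.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical arm features in d = 3 dimensions.
    arms = np.array([
        [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
        [1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0],
    ])
    theta = np.array([1.0, -0.5, 0.25])           # hypothetical true parameter
    probs = np.full(len(arms), 1.0 / len(arms))   # known sampling distribution
    mean_feat = probs @ arms                      # expected feature under probs

    pulls = rng.choice(len(arms), size=n_rounds, p=probs)
    X = arms[pulls]
    # Baseline shift: varies per round but does not depend on the pulled arm.
    shift = 2.0 + np.sin(np.arange(n_rounds) / 50.0)
    noise = 0.1 * rng.normal(size=n_rounds)
    rewards = X @ theta + shift + noise

    # Naive least squares ignores the shift and absorbs it into the estimate.
    naive = np.linalg.lstsq(X, rewards, rcond=None)[0]
    # Centered features have mean zero under probs, so they are uncorrelated
    # with anything that is independent of the arm choice, including the shift.
    centered = np.linalg.lstsq(X - mean_feat, rewards, rcond=None)[0]
    return theta, naive, centered

theta, naive, centered = simulate_and_estimate()
print("naive error:   ", np.linalg.norm(naive - theta))
print("centered error:", np.linalg.norm(centered - theta))
```

In this sketch the centered estimate stays close to `theta` while the naive one does not. Centering requires knowing the distribution the arms were drawn from, which is one reason the choice of sampling design matters in this setting.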

Abstract

We study fixed-confidence Best Arm Identification (BAI) in semiparametric bandits, where rewards are linear in arm features plus an unknown additive baseline shift. Unlike linear-bandit BAI, this setting requires orthogonalized regression, and its instance-optimal sample complexity has remained open. For the transductive setting, we establish an attainable instance-dependent lower bound characterized by the corresponding linear-bandit complexity on shifted features. We then propose a computationally efficient phase-elimination algorithm based on a new $XY$-design for orthogonalized regression. Our analysis yields a nearly optimal high-probability sample-complexity upper bound, up to log factors and an additive $d^2$ term, and experiments on synthetic instances and the Jester dataset show clear gains over prior baselines.
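For orientation, the reward model that the abstract describes in words is commonly written as below. The notation is an assumption borrowed from the usual semiparametric-bandit formulation, not taken from the paper: $a_t$ is the arm pulled in round $t$, $x_a \in \mathbb{R}^d$ its feature vector, $\theta^*$ the unknown parameter, $\nu_t$ the baseline shift (assumed here to be shared by all arms within a round), and $\eta_t$ zero-mean noise.

$$
r_t = x_{a_t}^\top \theta^* + \nu_t + \eta_t
$$

Because $\nu_t$ adds the same amount to every arm in a given round, it does not change which arm has the largest expected reward. In the transductive setting, with a candidate set $\mathcal{Z}$ that may differ from the set of arms that can be pulled, the target under this assumed notation is $z^* = \arg\max_{z \in \mathcal{Z}} z^\top \theta^*$.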
