Budget-Constrained Causal Bandits: Bridging Uplift Modeling and Sequential Decision-Making

arXiv cs.LG / April 30, 2026


Key Points

  • The paper tackles budget-constrained treatment allocation in digital advertising, where advertisers must decide ad exposure under limited budgets while accounting for heterogeneous treatment effects.
  • It proposes Budget-Constrained Causal Bandits (BCCB), an online sequential framework that jointly learns ad effectiveness per user, explores uncertain responders, and paces spending over time.
  • Unlike a common two-stage offline pipeline (HTE estimation followed by constrained optimization), BCCB is designed to work in cold-start scenarios with little or no historical data.
  • Experiments on the Criteo Uplift dataset (from a real randomized controlled trial) show a data-efficiency crossover: offline methods need around 10,000 historical observations for reliable performance, while BCCB works effectively from the first user.
  • BCCB also yields 3–5x lower variance across runs and outperforms baseline online approaches (including standard and budgeted Thompson Sampling) as well as greedy uplift estimation across tested budget levels.
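To make the contrast concrete, the two-stage offline pipeline that BCCB is compared against can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes binary conversions, unit cost per ad impression, and segment-level (rather than fully individual) treatment effects, and all function names are hypothetical.

```python
def estimate_hte(history):
    """Stage 1: estimate heterogeneous treatment effects from historical data.

    history: list of (segment, treated, converted) records from past campaigns.
    Returns {segment: uplift}, where uplift = P(convert | treated) - P(convert | control).
    """
    stats = {}  # segment -> [treated_conv, treated_n, control_conv, control_n]
    for seg, treated, conv in history:
        s = stats.setdefault(seg, [0, 0, 0, 0])
        if treated:
            s[0] += conv
            s[1] += 1
        else:
            s[2] += conv
            s[3] += 1
    uplift = {}
    for seg, (tc, tn, cc, cn) in stats.items():
        if tn and cn:  # need observations in both arms to estimate an effect
            uplift[seg] = tc / tn - cc / cn
    return uplift

def allocate_budget(users, uplift, budget, cost_per_ad=1.0):
    """Stage 2: constrained allocation -- here a greedy knapsack that treats
    the highest-uplift users until the budget is exhausted."""
    ranked = sorted(users, key=lambda u: uplift.get(u, 0.0), reverse=True)
    treated = []
    for u in ranked:
        if budget < cost_per_ad:
            break
        if uplift.get(u, 0.0) > 0:  # never spend on non-responders
            treated.append(u)
            budget -= cost_per_ad
    return treated
```

The cold-start failure mode is visible in Stage 1: with no `history`, `estimate_hte` returns an empty dict and Stage 2 has nothing to rank on.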

Abstract

Treatment allocation under budget constraints is a central challenge in digital advertising: advertisers must decide which users to show ads to while spending a limited budget wisely. The standard approach follows a two-stage offline pipeline: first collect historical data to estimate heterogeneous treatment effects (HTE), then solve a constrained optimization to allocate the budget. This works well with abundant data, but fails in cold-start settings such as new campaigns, new markets, or new customer segments where little historical data exists. We propose Budget-Constrained Causal Bandits (BCCB), an online framework that learns which users respond to ads while simultaneously spending the budget, making treatment decisions one user at a time. BCCB unifies three components into a single sequential process: learning individual-level ad effectiveness, exploring users whose response is uncertain, and pacing the budget over time. We evaluate BCCB on the Criteo Uplift dataset, a large-scale advertising dataset from a real randomized controlled trial. Our key finding is a data-efficiency crossover: offline methods require approximately 10,000 historical observations to produce reliable results, while BCCB operates effectively from the very first user. Furthermore, BCCB exhibits 3–5x lower performance variance between runs, making it more practical for real campaign planning. Among purely online methods, BCCB consistently outperforms standard Thompson Sampling, budgeted Thompson Sampling, and greedy HTE estimation across all budget levels tested.
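The three components the abstract names (learning per-user effectiveness, exploring uncertain responders, pacing the budget) can be illustrated in one online loop. The sketch below is an assumption-laden stand-in, not the paper's BCCB algorithm: it uses Beta-Bernoulli Thompson Sampling over segment-level uplift, unit cost per ad, and a simple linear pacing rule; the class and method names are invented for illustration.

```python
import random

class BCCBSketch:
    """Illustrative budget-constrained causal bandit loop (hypothetical sketch):
    Thompson Sampling on per-segment uplift plus linear budget pacing."""

    def __init__(self, horizon, budget, cost_per_ad=1.0, seed=0):
        self.rng = random.Random(seed)
        self.horizon = horizon
        self.initial_budget = budget
        self.budget = budget
        self.cost = cost_per_ad
        # Beta(1, 1) priors over conversion rates, per segment and arm.
        self.treated = {}  # segment -> [alpha, beta] under treatment
        self.control = {}  # segment -> [alpha, beta] under control

    def _sample(self, table, seg):
        a, b = table.setdefault(seg, [1.0, 1.0])
        return self.rng.betavariate(a, b)

    def decide(self, seg, t):
        """Treat user t iff the sampled uplift is positive and pacing allows.

        Sampling from the posterior handles exploration: uncertain segments
        produce high-variance draws and occasionally win a treatment slot."""
        if self.budget < self.cost:
            return False
        # Linear pacing: cumulative spend may not run ahead of budget * (t+1)/horizon.
        spent = self.initial_budget - self.budget
        if spent + self.cost > self.initial_budget * (t + 1) / self.horizon:
            return False
        uplift = self._sample(self.treated, seg) - self._sample(self.control, seg)
        return uplift > 0

    def update(self, seg, treated, converted):
        """Update the Beta posterior of the arm actually played; charge the budget."""
        arm = self.treated if treated else self.control
        ab = arm.setdefault(seg, [1.0, 1.0])
        ab[0 if converted else 1] += 1.0
        if treated:
            self.budget -= self.cost
```

Unlike the offline pipeline, this loop needs no historical data: the Beta(1, 1) priors let it act from the very first user, and every decision sharpens the posteriors it will sample from next.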