Meritocratic Fairness in Budgeted Combinatorial Multi-armed Bandits via Shapley Values

arXiv cs.LG / 5/4/2026

📰 NewsIdeas & Deep AnalysisModels & Research

共有:

Key Points

The paper introduces a new fairness framework for budgeted combinatorial multi-armed bandits with full-bandit feedback (BCMAB-FBF), where individual arm contributions are not directly observable.
To quantify arm contributions under this harder feedback setting, the authors extend Shapley values to a K-Shapley value that limits marginal contributions to coalitions of size at most K and prove it is uniquely characterized by key cooperative-game properties.
They propose K-SVFair-FBF, a fairness-aware bandit algorithm that adaptively estimates K-Shapley values when the valuation function is unknown.
The method is designed to both learn the valuation function from full feedback and reduce Monte Carlo approximation noise, with a theoretical fairness-regret guarantee of O(T^{3/4}).
Experiments on federated learning and social influence maximization datasets show improved fairness and effectiveness compared with existing baselines.

Abstract

We propose a new framework for meritocratic fairness in budgeted combinatorial multi-armed bandits with full-bandit feedback (BCMAB-FBF). Unlike semi-bandit feedback, the contribution of individual arms is not received in full-bandit feedback, making the setting significantly more challenging. To compute arm contributions in BCMAB-FBF, we first extend the Shapley value, a classical solution concept from cooperative game theory, to the

K

-Shapley value, which captures the marginal contribution of an agent restricted to a set of size at most

K

. We show that

K

-Shapley value is a unique solution concept that satisfies Symmetry, Linearity, Null player, and efficiency properties. We next propose K-SVFair-FBF, a fairness-aware bandit algorithm that adaptively estimates

K

-Shapley value with unknown valuation function. Unlike standard bandit literature on full bandit feedback, K-SVFair-FBF not only learns the valuation function under full feedback setting but also mitigates the noise arising from Monte Carlo approximations. Theoretically, we prove that K-SVFair-FBF achieves

O(T^{3/4})

regret bound on fairness regret. Through experiments on federated learning and social influence maximization datasets, we demonstrate that our approach achieves fairness and performs more effectively than existing baselines.

AnnouncementsBuilding a new enterprise AI services company with Blackstone, Hellman & Friedman, and Goldman Sachs

Anthropic News

Dara Khosrowshahi on replacing Uber drivers — and himself — with AI

The Verge

CLMA Frame Test

Dev.to

You Are Right — You Don't Need CLAUDE.md

Dev.to

Governance and Liability in AI Agents: What I Built Trying to Answer Those Questions

Dev.to

Meritocratic Fairness in Budgeted Combinatorial Multi-armed Bandits via Shapley Values

Key Points

Abstract

Related Articles

AnnouncementsBuilding a new enterprise AI services company with Blackstone, Hellman & Friedman, and Goldman Sachs

Dara Khosrowshahi on replacing Uber drivers — and himself — with AI

CLMA Frame Test

You Are Right — You Don't Need CLAUDE.md

Governance and Liability in AI Agents: What I Built Trying to Answer Those Questions

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer