Learning Shared Representations for Multi-Task Linear Bandits
arXiv cs.LG / 4/2/2026
Key Points
- The paper studies multi-task linear bandits where T related tasks share a common low-dimensional latent representation, with a shared subspace of dimension r much smaller than d and T.
- It introduces a new OFUL-style algorithm with a two-stage pipeline: an exploration phase with spectral initialization to estimate the shared model, followed by OFUL learning over a confidence set built from the low-rank structure.
- The authors prove that the true reward vectors lie in the constructed confidence set with high probability, and they derive cumulative regret bounds.
- The proposed method achieves regret O(√(drNT)), compared with O(dT√N) when each task is treated independently. The ratio of the two bounds is √(dT/r), so the gain grows as r becomes small relative to d and T.
- Numerical simulations are included to empirically validate performance across different problem sizes.
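To make the first stage more concrete, here is a minimal Python sketch of one way a shared subspace could be estimated from exploration data. It is an illustration under stated assumptions, not the paper's procedure: each task gets an ordinary least-squares estimate, and the top-r left singular vectors of the stacked estimates are taken as the subspace. The function names, the Gaussian exploration features, the noise level, and the problem sizes are all hypothetical choices made for the example.

```python
import numpy as np


def estimate_shared_subspace(X, y, r):
    """Estimate an r-dimensional shared subspace (illustrative sketch only).

    X: array of shape (T, n, d), exploration features for each of T tasks.
    y: array of shape (T, n), observed rewards.
    Returns a (d, r) matrix with orthonormal columns.
    """
    T = X.shape[0]
    # Per-task least-squares estimates, stacked as columns of a d x T matrix.
    theta_hat = np.stack(
        [np.linalg.lstsq(X[t], y[t], rcond=None)[0] for t in range(T)],
        axis=1,
    )
    # The top-r left singular vectors span the estimated shared subspace.
    U, _, _ = np.linalg.svd(theta_hat, full_matrices=False)
    return U[:, :r]


def subspace_distance(B_hat, B):
    """Spectral norm of the part of B_hat lying outside span(B)."""
    d = B.shape[0]
    return np.linalg.norm((np.eye(d) - B @ B.T) @ B_hat, 2)


# Synthetic problem (all sizes are assumptions for the demo).
rng = np.random.default_rng(0)
d, r, T, n = 20, 2, 50, 40

B, _ = np.linalg.qr(rng.standard_normal((d, r)))   # true shared basis
W = rng.standard_normal((r, T))                    # task-specific coefficients
Theta = B @ W                                      # true reward vectors, d x T

X = rng.standard_normal((T, n, d))                 # exploration features
noise = 0.1 * rng.standard_normal((T, n))
y = np.einsum("tnd,dt->tn", X, Theta) + noise      # noisy linear rewards

B_hat = estimate_shared_subspace(X, y, r)
err = subspace_distance(B_hat, B)
print(f"subspace distance: {err:.3f}")
```

This sketch uses n ≥ d exploration samples per task so that each least-squares problem is well posed. It covers only the subspace-estimation idea and does not implement the confidence set or the OFUL stage described in the summary.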