Bandits on graphs and structures

arXiv cs.LG / 5/6/2026

💬 OpinionIdeas & Deep AnalysisModels & Research

Key Points

  • The thesis focuses on structural properties of sequential decision problems to make solutions more practical for deployment.
  • It highlights action spaces representable as graphs, studying graph bandits under reward smoothness (spectral bandits), side observations, and influence maximization.
  • It extends to very large structured domains, including kernel bandits and polymatroid bandits, as well as bandits for function optimization with unknown smoothness.
  • The work also addresses infinitely many-arms bandits, covering settings where the action space may be exponential or even infinite in size.
  • Overall, it is positioned as a survey of the author’s contributions to graph-based and structured bandit problems.

Abstract

The goal of this thesis is to investigate the structural properties of certain sequential problems in order to bring the solutions closer to a practical use. In the first part, we put a special emphasis on structures that can be represented as graphs on actions. In the second part, we study the large action spaces that can be of exponential size in the number of base actions or even infinite. For graph bandits, we consider the settings of smoothness of rewards (spectral bandits), side observations, and influence maximization. For large structured domains, we cover kernel bandits, polymatroid bandits, bandits for function optimization (including unknown smoothness), and infinitely many-arms bandits. The thesis aspires to be a survey of the author's contributions on graph and structured bandits.