Bandits attack function optimization

arXiv cs.LG / 5/6/2026

Key Points

  • The paper treats function optimization as a sequential decision-making problem subject to a limited evaluation budget, restricting how many times the objective can be queried.
  • It proposes Simultaneous Optimistic Optimization (SOO), a deterministic algorithm that partitions the search domain and balances exploration with exploitation to find potential global maxima.
  • SOO is motivated by a continuous analogue of multi-armed bandit strategies, using an initial quasi-uniform search for exploration and local optimization for exploitation.
  • The authors cite two benefits of the domain-partitioning approach: guarantees on the returned solution and the numerical efficiency of the algorithm.
  • Empirical results are provided on the CEC'2014 single-objective real-parameter numerical optimization benchmark suite to assess SOO’s performance.

Abstract

We consider function optimization as a sequential decision-making problem under a budget constraint. This constraint limits the number of objective function evaluations allowed during the optimization. We consider an algorithm inspired by a continuous version of a multi-armed bandit problem which attacks this optimization problem by solving the tradeoff between exploration (initial quasi-uniform search of the domain) and exploitation (local optimization around the potentially global maxima). We introduce the so-called Simultaneous Optimistic Optimization (SOO), a deterministic algorithm that works by domain partitioning. The benefits of such an approach are the guarantees on the returned solution and the numerical efficiency of the algorithm. We present this machine learning approach to optimization, and provide an empirical assessment of SOO on the CEC'2014 competition on single-objective real-parameter numerical optimization test suite.
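
To make the idea of optimistic domain partitioning under an evaluation budget concrete, here is a minimal one-dimensional sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the function name `soo_maximize`, the ternary split, the depth cap `sqrt(evals)`, and the tie rule (`>=`) are all choices made for this example, and the abstract does not specify them.

```python
import math
from typing import Callable, Dict, List, Tuple

# A leaf cell: (objective value at the cell center, left edge, right edge).
Cell = Tuple[float, float, float]


def soo_maximize(f: Callable[[float], float], lo: float, hi: float,
                 budget: int) -> Tuple[float, float, int]:
    """Hypothetical 1-D sketch of optimistic partitioning on [lo, hi].

    Returns (best point found, its value, number of evaluations used).
    """
    center = 0.5 * (lo + hi)
    best_x, best_v = center, f(center)
    evals = 1
    # Leaves of the partition tree, grouped by depth.
    leaves: Dict[int, List[Cell]] = {0: [(best_v, lo, hi)]}

    while evals < budget:
        h_max = math.sqrt(evals)   # assumed depth cap; limits how deep a sweep may go
        v_max = -math.inf          # best value expanded so far in this sweep
        expanded = False
        for h in sorted(leaves):   # sweep from coarse cells to fine cells
            if h > h_max or evals >= budget:
                break
            cells = leaves[h]
            if not cells:
                continue
            # Best leaf at this depth, judged by its center value.
            i = max(range(len(cells)), key=lambda j: cells[j][0])
            if cells[i][0] < v_max:
                continue           # a coarser cell already looks better; skip this depth
            v, a, b = cells.pop(i)
            v_max = v
            expanded = True
            width = (b - a) / 3.0
            children = leaves.setdefault(h + 1, [])
            for j in range(3):
                ca = a + j * width
                cb = ca + width
                if j == 1:
                    cv = v         # middle child shares the parent's center: no new query
                elif evals < budget:
                    x = 0.5 * (ca + cb)
                    cv = f(x)
                    evals += 1
                    if cv > best_v:
                        best_x, best_v = x, cv
                else:
                    continue       # budget exhausted mid-expansion
                children.append((cv, ca, cb))
        if not expanded:
            break                  # safety guard against a sweep that makes no progress
    return best_x, best_v, evals


if __name__ == "__main__":
    # Toy objective with its maximum at x = 0.3 (an assumption for illustration).
    x, v, n = soo_maximize(lambda t: -(t - 0.3) ** 2, 0.0, 1.0, budget=100)
    print(f"best x = {x:.5f}, f(x) = {v:.3e}, evaluations = {n}")
```

In this sketch, large cells expanded early in each sweep play the role of the quasi-uniform exploration, while repeatedly splitting the best small cells corresponds to local refinement around promising maxima. The procedure is deterministic, and the budget is enforced by counting every call to `f`.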