Black-Box Optimization From Small Offline Datasets via Meta Learning with Synthetic Tasks

arXiv cs.LG / 4/15/2026


Key Points

  • The paper addresses offline black-box optimization under small or low-quality datasets, a setting common in scientific design tasks such as molecule and materials discovery.
  • It identifies a core limitation of prior methods: the surrogate model must capture the optimization bias (i.e., rank candidate designs correctly), which is difficult to achieve with limited data.
  • The proposed method, OptBias, uses meta-learning: it generates synthetic optimization tasks from a Gaussian process and trains on them to learn a reusable optimization-bias representation.
  • OptBias then fine-tunes the surrogate model on the limited real dataset for the target task, improving performance across both continuous and discrete benchmarks.
  • Experimental results show OptBias consistently outperforms state-of-the-art offline optimization baselines specifically in small-data regimes, suggesting practical robustness for realistic offline settings.
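The "optimization bias" the paper centers on is the surrogate's ability to rank candidate designs correctly. A minimal way to make that notion concrete is a pairwise ranking accuracy: the fraction of design pairs whose relative order the surrogate gets right. The metric name and implementation below are illustrative assumptions, not taken from the paper.

```python
import itertools

def pairwise_ranking_accuracy(y_true, y_pred):
    """Fraction of design pairs whose relative order the surrogate
    predicts correctly (an illustrative proxy for how well the
    surrogate captures the optimization bias)."""
    correct = total = 0
    for i, j in itertools.combinations(range(len(y_true)), 2):
        if y_true[i] == y_true[j]:
            continue  # skip ties in the ground truth
        total += 1
        # A pair counts as correct if predicted order matches true order.
        correct += (y_true[i] > y_true[j]) == (y_pred[i] > y_pred[j])
    return correct / total
```

For example, a surrogate that swaps the order of the two best designs out of three scores 2/3 on this metric, even if its absolute value predictions are otherwise accurate.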

Abstract

We consider the problem of offline black-box optimization, where the goal is to discover optimal designs (e.g., molecules or materials) from past experimental data. A key challenge in this setting is data scarcity: in many scientific applications, only small or poor-quality datasets are available, which severely limits the effectiveness of existing algorithms. Prior work has shown, both theoretically and empirically, that the performance of offline optimization algorithms depends on how well the surrogate model captures the optimization bias (i.e., its ability to rank input designs correctly), which is challenging to accomplish with limited experimental data. This paper proposes Surrogate Learning with Optimization Bias via Synthetic Task Generation (OptBias), a meta-learning framework that directly tackles data scarcity. OptBias learns a reusable optimization bias by training on synthetic tasks generated from a Gaussian process, and then fine-tunes the surrogate model on the small dataset for the target task. Across diverse continuous and discrete offline optimization benchmarks, OptBias consistently outperforms state-of-the-art baselines in small-data regimes. These results highlight OptBias as a robust and practical solution for offline optimization in realistic small-data settings.
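The abstract's synthetic-task generation step can be sketched by sampling functions from a Gaussian process prior: each draw yields a cheap, fully labeled optimization task on which a surrogate can be meta-trained before fine-tuning on the scarce real data. The kernel choice, input range, and task sizes below are illustrative assumptions, not details from the paper.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=0.2, variance=1.0):
    # Squared-exponential (RBF) kernel between two sets of 1-D inputs.
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def sample_synthetic_task(n_points=32, rng=None):
    """Draw one synthetic optimization task: random inputs paired with
    values of a function sampled from a zero-mean GP prior."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.uniform(0.0, 1.0, size=n_points)
    cov = rbf_kernel(x, x) + 1e-6 * np.eye(n_points)  # jitter for stability
    y = rng.multivariate_normal(np.zeros(n_points), cov)
    return x, y

# Meta-training data: a collection of synthetic (design, score) tasks.
tasks = [sample_synthetic_task(rng=np.random.default_rng(seed))
         for seed in range(8)]
```

Because every sampled function is known in full, the surrogate's ranking behavior can be supervised exactly on these tasks, which is precisely what is hard to do with a small real dataset.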