BOAT: Navigating the Sea of In Silico Predictors for Antibody Design via Multi-Objective Bayesian Optimization

arXiv cs.LG / 4/16/2026


Key Points

  • The paper introduces BOAT, a plug-and-play Bayesian optimization framework aimed at multi-objective antibody lead optimization across multiple predicted properties simultaneously.
  • BOAT combines uncertainty-aware surrogate modeling with a genetic algorithm to efficiently explore antibody sequence space while reducing reliance on resource-intensive sequential filtering pipelines.
  • The authors benchmark BOAT against genetic algorithms and newer generative learning methods for multi-objective protein optimization and report performance competitive with state-of-the-art approaches.
  • The study delineates when surrogate-driven optimization is likely to outperform expensive generative approaches and highlights practical limits related to sequence dimensionality and oracle (evaluation) costs.
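
To make "multi-objective" concrete, here is a minimal sketch of Pareto dominance, one common way to compare candidates across several predicted properties at once. The summary does not say how BOAT ranks trade-offs, so treat this as a generic illustration rather than the authors' method. The score tuples are hypothetical and both objectives are assumed to be maximized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (maximization assumed)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical predicted scores, e.g. (affinity, stability), for four candidates.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.5, 0.5), (0.2, 0.9)]
front = pareto_front(candidates)
print(front)  # [0, 1, 3]
```

The third candidate is dropped because the second beats it on both properties. The other three represent different trade-offs, and none is strictly better than another. A sequential filter with a hard cutoff on each property in turn could discard some of them before the remaining properties are even scored.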

Abstract

Antibody lead optimization is inherently a multi-objective challenge in drug discovery. Achieving a balance between different drug-like properties is crucial for the development of viable candidates, and this search becomes exponentially more challenging as the number of desired properties grows. The ever-growing zoo of sophisticated in silico tools for predicting antibody properties calls for an efficient joint optimization procedure to overcome resource-intensive sequential filtering pipelines. We present BOAT, a versatile Bayesian optimization framework for multi-property antibody engineering. Our "plug-and-play" framework couples uncertainty-aware surrogate modeling with a genetic algorithm to jointly optimize various predicted antibody traits while enabling efficient exploration of sequence space. Through systematic benchmarking against genetic algorithms and newer generative learning approaches, we demonstrate performance competitive with state-of-the-art methods for multi-objective protein optimization. We identify clear regimes where surrogate-driven optimization outperforms expensive generative approaches and establish practical limits imposed by sequence dimensionality and oracle costs.
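
The coupling of an uncertainty-aware surrogate with an evolutionary search can be sketched as a small loop. This is a toy illustration under loud assumptions, not BOAT's implementation. The two oracles are made-up stand-ins for in silico predictors. The surrogate is a crude nearest-neighbour estimate whose uncertainty is the Hamming distance to the closest evaluated sequence. The proposal step is mutation-only, a simplified stand-in for a full genetic algorithm. The objectives are combined with a random-weight scalarization plus an exploration bonus. All function names and the starting sequence are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_oracles(seq):
    """Hypothetical stand-ins for expensive predictors (both maximized)."""
    hydrophobic = sum(c in "AILMFVW" for c in seq) / len(seq)
    charged = sum(c in "DEKR" for c in seq) / len(seq)
    return (hydrophobic, charged)


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def surrogate(seq, data, k=3):
    """Crude surrogate: per-objective mean of the k nearest evaluated
    sequences, with normalized distance to the nearest one as uncertainty."""
    nearest = sorted(data, key=lambda item: hamming(seq, item[0]))[:k]
    n_obj = len(nearest[0][1])
    mean = [sum(y[i] for _, y in nearest) / len(nearest) for i in range(n_obj)]
    sigma = hamming(seq, nearest[0][0]) / len(seq)
    return mean, sigma


def acquisition(seq, data, weights, beta=0.5):
    """Weighted sum of predicted objectives plus an exploration bonus."""
    mean, sigma = surrogate(seq, data)
    return sum(w * m for w, m in zip(weights, mean)) + beta * sigma


def mutate(seq, rng):
    """Single random point mutation."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def optimize(start, rounds=5, batch=4, pool=50, seed=0):
    rng = random.Random(seed)
    data = [(start, toy_oracles(start))]  # (sequence, true scores) pairs
    for _ in range(rounds):
        w = rng.random()
        weights = (w, 1.0 - w)  # fresh random trade-off each round
        parents = [s for s, _ in data]
        # Cheap step: propose many mutants and rank them with the surrogate.
        proposals = {mutate(rng.choice(parents), rng) for _ in range(pool)}
        proposals -= set(parents)
        ranked = sorted(proposals, key=lambda s: (-acquisition(s, data, weights), s))
        # Expensive step: only the top-ranked batch is sent to the oracles.
        for seq in ranked[:batch]:
            data.append((seq, toy_oracles(seq)))
    return data


history = optimize("QVQLVQSGAEVKKPGASVKV")
print(len(history), "sequences evaluated")
```

The point of the design is the split between the two steps. Many candidates are scored by the cheap surrogate, while the costly predictors are called at most `batch` times per round. That is where oracle cost, one of the practical limits the abstract mentions, enters the picture.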