AdaBoN: Adaptive Best-of-N Alignment
arXiv cs.CL / 3/16/2026
Key Points
- AdaBoN proposes a prompt-adaptive Best-of-N alignment strategy to allocate inference-time compute more efficiently for language-model alignment.
- The method uses a two-stage algorithm: an initial exploratory phase spends a small budget estimating the reward distribution for each prompt, and a second stage adaptively allocates the remaining budget across prompts using those estimates.
- Experiments on prompts from the AlpacaEval, HH-RLHF, and PKU-SafeRLHF datasets, covering 12 language-model/reward-model (LM/RM) pairs and 50 prompt batches, show that the adaptive strategy outperforms uniform allocation under the same budget.
- The approach remains competitive with uniform allocation given a 20% larger budget, and its gains grow as batch size increases.
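
The two-stage idea in the key points can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the summary does not state the actual allocation rule, so the sketch assumes a simple stand-in heuristic that gives extra samples in proportion to each prompt's observed reward spread. The names `adaptive_best_of_n`, `proportional_allocation`, `toy_reward`, and the `sample_reward` callback (standing in for "generate one response with the LM and score it with the RM") are all hypothetical.

```python
import random
import statistics
from typing import Callable, List, Tuple


def proportional_allocation(weights: List[float], budget: int) -> List[int]:
    """Split an integer budget in proportion to weights (largest-remainder rounding)."""
    total = sum(weights)
    if total <= 0:  # no spread observed anywhere: fall back to a uniform split
        weights = [1.0] * len(weights)
        total = float(len(weights))
    raw = [budget * w / total for w in weights]
    alloc = [int(r) for r in raw]  # floor, since every share is non-negative
    leftover = budget - sum(alloc)
    # Hand the remaining units to the largest fractional parts.
    by_remainder = sorted(range(len(raw)), key=lambda i: raw[i] - alloc[i], reverse=True)
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc


def adaptive_best_of_n(
    prompts: List[str],
    sample_reward: Callable[[str, random.Random], float],  # hypothetical LM+RM call
    total_budget: int,
    explore_per_prompt: int = 4,
    seed: int = 0,
) -> Tuple[List[float], List[int]]:
    """Return the best reward per prompt and the number of samples each prompt received."""
    if explore_per_prompt < 2:
        raise ValueError("need at least 2 exploratory samples to estimate spread")
    explore_cost = explore_per_prompt * len(prompts)
    if total_budget < explore_cost:
        raise ValueError("total_budget is smaller than the exploration cost")
    rng = random.Random(seed)

    # Stage 1: spend a small, equal budget on every prompt to observe its rewards.
    rewards = [[sample_reward(p, rng) for _ in range(explore_per_prompt)] for p in prompts]

    # Stage 2 (ASSUMED heuristic): allocate the rest in proportion to observed spread.
    spreads = [statistics.stdev(r) for r in rewards]
    extra = proportional_allocation(spreads, total_budget - explore_cost)
    for p, r, n in zip(prompts, rewards, extra):
        r.extend(sample_reward(p, rng) for _ in range(n))

    return [max(r) for r in rewards], [len(r) for r in rewards]


def toy_reward(prompt: str, rng: random.Random) -> float:
    """Toy stand-in: longer prompts get noisier rewards (purely illustrative)."""
    return rng.gauss(0.0, 0.1 * len(prompt))
```

Calling `adaptive_best_of_n(["hi", "a much longer prompt"], toy_reward, total_budget=40)` returns the per-prompt best rewards together with the sample counts, which always sum to the total budget. A uniform baseline under the same budget would simply give every prompt `total_budget // len(prompts)` samples.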




