LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization

arXiv stat.ML / 4/10/2026


Key Points

  • The paper introduces the Prompt Duel Optimizer (PDO), a sample-efficient framework for label-free LLM prompt optimization that relies on pairwise preference feedback from an LLM judge rather than costly ground-truth labels.
  • PDO formulates prompt search as a dueling-bandit problem and uses Double Thompson Sampling to choose the most informative prompt comparisons within a fixed judge budget.
  • It also employs top-performer guided mutation to iteratively expand the candidate prompt set while pruning weaker prompts to improve efficiency.
  • Experiments on BIG-bench Hard (BBH) and MS MARCO indicate PDO finds better prompts than label-free baselines and achieves strong quality–cost trade-offs when comparison budgets are limited.

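To make the dueling-bandit framing concrete, here is a minimal sketch of Double Thompson Sampling over prompt duels. All names (`dts_select`, `optimize`, the `judge` callable) are illustrative, not from the paper; it keeps Beta posteriors over pairwise win rates, samples twice to pick an informative duel, and spends a fixed judge budget.

```python
import random

def dts_select(wins, n_prompts):
    """Pick a duel (i, j) via two rounds of Thompson sampling."""
    # Sample a win-probability matrix from Beta posteriors over past duels.
    theta = [[0.5] * n_prompts for _ in range(n_prompts)]
    for i in range(n_prompts):
        for j in range(n_prompts):
            if i != j:
                theta[i][j] = random.betavariate(wins[i][j] + 1, wins[j][i] + 1)
    # First arm: highest sampled Copeland score (# of opponents beaten).
    copeland = [sum(theta[i][j] > 0.5 for j in range(n_prompts) if j != i)
                for i in range(n_prompts)]
    first = max(range(n_prompts), key=lambda i: copeland[i])
    # Second arm: resample and pick the strongest challenger to `first`.
    second = max((j for j in range(n_prompts) if j != first),
                 key=lambda j: random.betavariate(wins[j][first] + 1,
                                                  wins[first][j] + 1))
    return first, second

def optimize(prompts, judge, budget):
    """Run `budget` LLM-judged duels and return the best prompt found.

    `judge(a, b)` returns True when prompt `a` is preferred over `b`;
    in practice this would be a pairwise-preference LLM call.
    """
    n = len(prompts)
    wins = [[0] * n for _ in range(n)]  # wins[i][j]: times i beat j
    for _ in range(budget):
        i, j = dts_select(wins, n)
        if judge(prompts[i], prompts[j]):
            wins[i][j] += 1
        else:
            wins[j][i] += 1
    # Winner: best Copeland score under the observed win counts.
    score = [sum(wins[i][j] > wins[j][i] for j in range(n)) for i in range(n)]
    return prompts[max(range(n), key=lambda i: score[i])]
```

Because both arms are drawn from the posterior rather than a point estimate, uncertain matchups get compared more often, which is what keeps the judge budget focused on informative duels.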
Abstract

Large language models (LLMs) are highly sensitive to prompts, but most automatic prompt optimization (APO) methods assume access to ground-truth references (e.g., labeled validation data) that are costly to obtain. We propose the Prompt Duel Optimizer (PDO), a sample-efficient framework for label-free prompt optimization based on pairwise preference feedback from an LLM judge. PDO casts prompt selection as a dueling-bandit problem and combines (i) Double Thompson Sampling to prioritize informative comparisons under a fixed judge budget, with (ii) top-performer guided mutation to expand the candidate pool while pruning weak prompts. Experiments on BIG-bench Hard (BBH) and MS MARCO show that PDO consistently identifies stronger prompts than label-free baselines, while offering favorable quality–cost trade-offs under constrained comparison budgets.
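The top-performer guided mutation step can be sketched as a simple evolve-and-prune loop. This is a hedged illustration, not the paper's implementation: `mutate_fn` stands in for an LLM call that rewrites a prompt, and the `keep`/`spawn` parameters are assumed knobs.

```python
def evolve_pool(prompts, scores, mutate_fn, keep=4, spawn=2):
    """Keep the `keep` highest-scoring prompts, prune the rest, and add
    `spawn` mutations of the current top performer to the pool.

    `scores` maps each prompt to its current estimate (e.g., a Copeland
    score from the duel phase); `mutate_fn(prompt, k)` produces the k-th
    rewritten variant (an LLM rewrite in practice).
    """
    ranked = sorted(prompts, key=lambda p: scores[p], reverse=True)
    survivors = ranked[:keep]                             # prune weak prompts
    top = survivors[0]
    children = [mutate_fn(top, k) for k in range(spawn)]  # expand the pool
    return survivors + children
```

Alternating this expansion step with the dueling-bandit selection phase lets the candidate pool grow around what is currently winning, while the budget stays bounded because weak prompts are dropped each round.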