Quantifying and Mitigating Self-Preference Bias of LLM Judges

arXiv cs.LG / 4/28/2026


Key Points

  • LLM-as-a-Judge evaluation can be distorted by Self-Preference Bias (SPB), where an LLM systematically favors or disfavors outputs it generated during judging.
  • Prior SPB measurements are often too expensive and may mix up a model’s generative ability with its evaluative stance, limiting practical large-scale use.
  • The paper proposes a fully automated framework that builds equal-quality response pairs to statistically separate true discriminability from bias propensity without needing human gold annotations.
  • Experiments on 20 mainstream LLMs suggest that higher capabilities are frequently uncorrelated with (or even negatively correlated with) low SPB, meaning strong models are not automatically fair judges.
  • To mitigate SPB, the authors introduce a structured, multi-dimensional evaluation approach based on cognitive load decomposition, reducing SPB by an average of 31.5%.
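To make the quantification idea concrete, here is a minimal sketch of how a signed self-preference score could be computed from judgments on equal-quality pairs. This is an illustration, not the paper's actual implementation: the input format and the 0.5 baseline (an unbiased judge should pick its own output half the time when quality is equal) are assumptions.

```python
def spb_score(judgments):
    """Signed self-preference bias from equal-quality pair judgments.

    judgments: list of booleans, True when the judge preferred its OWN
    response in a pair constructed to have negligible quality difference
    (hypothetical input format).

    Returns a value in [-0.5, 0.5]:
      0   -> unbiased (picks its own output at the 50% chance rate),
      > 0 -> self-favoring, < 0 -> self-disfavoring.
    """
    if not judgments:
        raise ValueError("need at least one judgment")
    own_win_rate = sum(judgments) / len(judgments)
    return own_win_rate - 0.5
```

Because the pairs are built to be of equal quality, any deviation of the own-win rate from 0.5 reflects bias propensity rather than genuine discriminability, which is the disentanglement the framework targets.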

Abstract

LLM-as-a-Judge has become a dominant approach in automated evaluation systems, playing critical roles in model alignment, leaderboard construction, and quality control, among other applications. However, the scalability and trustworthiness of this approach can be substantially undermined by Self-Preference Bias (SPB), a directional evaluative deviation in which LLMs systematically favor or disfavor their own generated outputs during evaluation. Existing measurements rely on costly human annotations and conflate generative capability with evaluative stance, making them impractical for large-scale deployment in real-world systems. To address this issue, we introduce a fully automated framework for quantifying and mitigating SPB. The framework constructs pairs of responses with negligible quality differences, enabling statistical disentanglement of discriminability from bias propensity without human gold standards. Empirical analysis across 20 mainstream LLMs reveals that advanced capabilities are often uncorrelated, or even negatively correlated, with low SPB. To mitigate this bias, we propose a structured multi-dimensional evaluation strategy grounded in cognitive load decomposition, which reduces SPB by 31.5% on average.
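The mitigation strategy, decomposing one holistic judgment into several lower-load per-dimension assessments, can be sketched as a prompt builder. The dimension names and prompt wording below are illustrative assumptions, not the rubric used in the paper.

```python
# Hypothetical rubric dimensions; the paper's actual decomposition may differ.
DIMENSIONS = ["factual accuracy", "instruction adherence", "clarity", "completeness"]

def build_structured_prompt(question, response_a, response_b):
    """Build a judging prompt that asks for per-dimension scores before an
    overall verdict, instead of a single holistic comparison. Splitting the
    judgment this way is meant to lower the cognitive load of each step."""
    rubric = "\n".join(
        f"{i + 1}. Rate each response on {dim} (1-5)."
        for i, dim in enumerate(DIMENSIONS)
    )
    return (
        f"Question:\n{question}\n\n"
        f"Response A:\n{response_a}\n\n"
        f"Response B:\n{response_b}\n\n"
        f"Evaluate step by step:\n{rubric}\n"
        "Finally, state which response is better overall."
    )
```

The design intuition is that a judge forced to commit to narrow, verifiable sub-scores first has less room for a diffuse "this sounds like me" preference to drive the final verdict.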