Distributional Open-Ended Evaluation of LLM Cultural Value Alignment Based on Value Codebook

arXiv cs.LG / 4/9/2026

Key Points

The paper argues that existing benchmarks for LLM cultural value alignment face the Construct-Composition-Context (C^3) challenge: they rely on discriminative, multiple-choice formats that measure value knowledge rather than genuine value orientations, overlook subcultural heterogeneity, and do not match real-world open-ended generation.
It introduces DOVE, a distributional evaluation framework that compares human-written text distributions to LLM-generated outputs instead of relying on fixed-choice probing.
DOVE builds a compact value codebook from 10K documents using a rate-distortion variational optimization objective to reduce semantic noise and map text into a structured value space.
Alignment is quantified with unbalanced optimal transport, which captures intra-cultural distributional structure and sub-group diversity and so accounts for heterogeneity within each culture.
Experiments across 12 LLMs report improved predictive validity, reaching a 31.56% correlation with downstream tasks, and show strong reliability with as few as 500 samples per culture.
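
To make the unbalanced optimal transport step concrete, here is a minimal sketch, not the paper's implementation. It assumes each culture and each model is summarized as a strictly positive histogram over a handful of value codes, uses an entropic Sinkhorn solver with KL-relaxed marginals, and picks a toy cost matrix. The names `unbalanced_sinkhorn`, `uot_score`, `eps`, and `rho` are hypothetical and chosen for illustration.

```python
import numpy as np


def unbalanced_sinkhorn(a, b, cost, eps=0.1, rho=1.0, n_iter=500):
    """Entropic unbalanced OT plan with KL-relaxed marginals (illustrative)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    kernel = np.exp(-np.asarray(cost, dtype=float) / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    # Exponent < 1 softens the marginal constraints; rho -> infinity
    # recovers the usual balanced Sinkhorn updates.
    power = rho / (rho + eps)
    for _ in range(n_iter):
        u = (a / (kernel @ v)) ** power
        v = (b / (kernel.T @ u)) ** power
    return u[:, None] * kernel * v[None, :]


def generalized_kl(p, q):
    """KL divergence for non-normalized, strictly positive vectors."""
    return float(np.sum(p * np.log(p / q) - p + q))


def uot_score(a, b, cost, eps=0.1, rho=1.0):
    """Transport cost plus marginal penalties; lower means closer."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cost = np.asarray(cost, dtype=float)
    plan = unbalanced_sinkhorn(a, b, cost, eps, rho)
    transport = float(np.sum(plan * cost))
    penalty = generalized_kl(plan.sum(axis=1), a) + generalized_kl(plan.sum(axis=0), b)
    return transport + rho * penalty


# Toy setup (assumed): three value codes, cost = distance between code indices.
codes = np.arange(3)
cost = np.abs(codes[:, None] - codes[None, :]).astype(float)
human = np.array([0.5, 0.3, 0.2])       # hypothetical human distribution
model_far = np.array([0.1, 0.2, 0.7])   # hypothetical model distribution

print("same:", uot_score(human, human, cost))
print("far: ", uot_score(human, model_far, cost))
```

Because the marginals are only softly enforced, mass that has no good counterpart can be left untransported at a penalty instead of being forced onto a distant code, which is the property that makes the unbalanced variant attractive when sub-groups differ in size.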

Abstract

As LLMs are globally deployed, aligning their cultural value orientations is critical for safety and user engagement. However, existing benchmarks face the Construct-Composition-Context (C^3) challenge: they rely on discriminative, multiple-choice formats that probe value knowledge rather than true orientations, overlook subcultural heterogeneity, and do not match real-world open-ended generation. We introduce DOVE, a distributional evaluation framework that directly compares human-written text distributions with LLM-generated outputs. DOVE utilizes a rate-distortion variational optimization objective to construct a compact value codebook from 10K documents, mapping text into a structured value space to filter semantic noise. Alignment is measured using unbalanced optimal transport, capturing intra-cultural distributional structures and sub-group diversity. Experiments across 12 LLMs show that DOVE achieves superior predictive validity, attaining a 31.56% correlation with downstream tasks, while maintaining high reliability with as few as 500 samples per culture.
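
The abstract does not spell out the rate-distortion objective, so the following is only a sketch of the general idea, assuming a classic soft-clustering formulation (Blahut-Arimoto style alternating updates) in place of the paper's variational method. Random 2-D points stand in for document embeddings, and `rd_codebook`, `beta`, and `n_codes` are hypothetical names. The trade-off being minimized is rate + beta * distortion, where rate is the mutual information between documents and codes and distortion is the expected squared distance to the assigned code.

```python
import numpy as np


def rd_codebook(x, n_codes=4, beta=5.0, n_iter=50, seed=0):
    """Soft codebook trading off rate I(X;Z) against distortion (illustrative)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Initialize code vectors from randomly chosen data points.
    codes = x[rng.choice(len(x), size=n_codes, replace=False)].copy()
    prior = np.full(n_codes, 1.0 / n_codes)
    for _ in range(n_iter):
        dist = ((x[:, None, :] - codes[None, :, :]) ** 2).sum(axis=2)
        # Soft assignment: q(z|x) proportional to p(z) * exp(-beta * d(x, z)).
        logits = np.log(prior)[None, :] - beta * dist
        logits -= logits.max(axis=1, keepdims=True)
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)
        # Update code usage and code vectors; clipping avoids log(0) and 0/0.
        prior = np.clip(q.mean(axis=0), 1e-12, None)
        prior /= prior.sum()
        codes = (q.T @ x) / np.maximum(q.sum(axis=0), 1e-12)[:, None]
    # Report both terms for the final assignments and code vectors.
    dist = ((x[:, None, :] - codes[None, :, :]) ** 2).sum(axis=2)
    distortion = float((q * dist).sum(axis=1).mean())
    rate = float((q * (np.log(q + 1e-300) - np.log(prior)[None, :])).sum(axis=1).mean())
    return codes, q, rate, distortion


# Stand-in data (assumed): two well-separated clusters of "embeddings".
rng = np.random.default_rng(1)
points = np.vstack([
    rng.normal(loc=-2.0, scale=0.3, size=(50, 2)),
    rng.normal(loc=2.0, scale=0.3, size=(50, 2)),
])
codes, q, rate, distortion = rd_codebook(points)
print("rate (nats):", rate, "distortion:", distortion)
```

A larger `beta` favors low distortion and uses more distinct codes; a smaller `beta` favors a lower rate and lets codes merge, which is one way to think about a compact codebook that filters out fine-grained semantic noise.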
