CoPA: Benchmarking Personalized Question Answering with Data-Informed Cognitive Factors

arXiv cs.CL / 4/17/2026


Key Points

  • The paper argues that evaluating personalization in question answering is still a bottleneck, since prior approaches often use lexical similarity or manual heuristics without strong data-driven validation.
  • It introduces Community-Individual Preference Divergence (CIPD) to extract six personalization factors from situations where individual preferences override group consensus.
  • The authors propose CoPA, a benchmark built from 1,985 user profiles, enabling fine-grained evaluation at the level of those factors.
  • CoPA assesses how well model outputs align with user-specific cognitive preferences inferred from interaction patterns, aiming to be more discriminative than generic QA metrics.
  • The authors release code on GitHub for running the benchmark and its evaluation methods.
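The summary does not spell out how CIPD instances are mined, but the core idea — flagging cases where a strong community consensus exists and an individual deviates from it — can be sketched as follows. All function names, data shapes, and the consensus threshold here are hypothetical illustrations, not the paper's actual procedure:

```python
from collections import Counter

def find_preference_divergence(community_votes, user_choice, min_consensus=0.7):
    """Flag a community-individual preference divergence (CIPD-style signal):
    the community strongly agrees on one option, yet the user picks another.

    community_votes: list of option labels chosen by community members
    user_choice: the option this particular user chose
    min_consensus: fraction of votes the majority option must reach
    Returns (majority_option, user_choice) if they diverge, else None.
    """
    counts = Counter(community_votes)
    majority, n = counts.most_common(1)[0]
    if n / len(community_votes) >= min_consensus and user_choice != majority:
        return majority, user_choice
    return None

# Example: 8 of 10 community members prefer a "concise" answer style,
# but this user picks "detailed" -- a divergence worth mining.
votes = ["concise"] * 8 + ["detailed"] * 2
print(find_preference_divergence(votes, "detailed"))  # ('concise', 'detailed')
print(find_preference_divergence(votes, "concise"))   # None
```

Aggregating such divergence cases across many users would then give the raw material from which the paper's six personalization factors are distilled.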

Abstract

While LLMs have demonstrated remarkable potential in Question Answering (QA), evaluating personalization remains a critical bottleneck. Existing paradigms predominantly rely on lexical-level similarity or manual heuristics, often lacking sufficient data-driven validation. We address this by mining Community-Individual Preference Divergence (CIPD), where individual choices override consensus, to distill six key personalization factors as evaluative dimensions. Accordingly, we introduce CoPA, a benchmark with 1,985 user profiles for fine-grained, factor-level assessment. By quantifying the alignment between model outputs and user-specific cognitive preferences inferred from interaction patterns, CoPA provides a more comprehensive and discriminative standard for evaluating personalized QA than generic metrics. The code is available at https://github.com/bjzgcai/CoPA.