CoPA: Benchmarking Personalized Question Answering with Data-Informed Cognitive Factors

arXiv cs.CL / 4/17/2026


Key Points

  • The paper argues that evaluating personalization in question answering is still a bottleneck, since prior approaches often use lexical similarity or manual heuristics without strong data-driven validation.
  • It introduces Community-Individual Preference Divergence (CIPD) to extract six personalization factors from situations where individual preferences override group consensus.
  • The authors propose CoPA, a benchmark built from 1,985 user profiles, enabling fine-grained evaluation at the level of those factors.
  • CoPA assesses how well model outputs align with user-specific cognitive preferences inferred from interaction patterns, aiming to be more discriminative than generic QA metrics.
  • The authors release code on GitHub for running the benchmark and its evaluation methods.
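The summary does not spell out how CIPD instances are mined, but the core idea — flagging cases where a strong community consensus exists and an individual deviates from it — can be sketched as follows. All function names, data shapes, and the consensus threshold here are hypothetical illustrations, not the paper's actual procedure:

```python
from collections import Counter

def find_preference_divergence(community_votes, user_choice, min_consensus=0.7):
    """Flag a community-individual preference divergence (CIPD-style signal):
    the community strongly agrees on one option, yet the user picks another.

    community_votes: list of option labels chosen by community members
    user_choice: the option this particular user chose
    min_consensus: fraction of votes the majority option must reach
    Returns (majority_option, user_choice) if they diverge, else None.
    """
    counts = Counter(community_votes)
    majority, n = counts.most_common(1)[0]
    if n / len(community_votes) >= min_consensus and user_choice != majority:
        return majority, user_choice
    return None

# Example: 8 of 10 community members prefer a "concise" answer style,
# but this user picks "detailed" -- a divergence worth mining.
votes = ["concise"] * 8 + ["detailed"] * 2
print(find_preference_divergence(votes, "detailed"))  # ('concise', 'detailed')
print(find_preference_divergence(votes, "concise"))   # None
```

Aggregating such divergence cases across many users would then give the raw material from which the paper's six personalization factors are distilled.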

Abstract

While LLMs have demonstrated remarkable potential in Question Answering (QA), evaluating personalization remains a critical bottleneck. Existing paradigms predominantly rely on lexical-level similarity or manual heuristics, often lacking sufficient data-driven validation. We address this by mining Community-Individual Preference Divergence (CIPD), where individual choices override consensus, to distill six key personalization factors as evaluative dimensions. Accordingly, we introduce CoPA, a benchmark with 1,985 user profiles for fine-grained, factor-level assessment. By quantifying the alignment between model outputs and user-specific cognitive preferences inferred from interaction patterns, CoPA provides a more comprehensive and discriminative standard for evaluating personalized QA than generic metrics. The code is available at https://github.com/bjzgcai/CoPA.