Attribution Bias in Large Language Models
arXiv cs.AI / 4/8/2026
💬 Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- The paper introduces AttriBench, a new quote-attribution benchmark dataset that is balanced across both author fame and demographics to study attribution fairness in a controlled way.
- Evaluating 11 widely used LLMs under different prompting setups shows quote attribution remains difficult even for frontier models.
- The study finds large, systematic attribution-accuracy disparities across race, gender, and intersectional demographic groups.
- It identifies and analyzes “suppression,” a failure mode where models omit attribution entirely despite having authorship information, and shows suppression is common and uneven across demographic groups.
- The authors propose quote attribution as a benchmark for representational fairness, highlighting gaps that standard accuracy metrics can miss.
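The summary doesn't include the paper's exact formulas, but the two quantities it highlights — per-group attribution accuracy and the "suppression" rate (the model omits attribution entirely) — can be sketched roughly as follows. All group names, records, and the `None`-means-suppressed convention here are hypothetical illustrations, not the paper's actual data or protocol:

```python
from collections import defaultdict

# Hypothetical records: (demographic_group, model_prediction, true_author).
# A prediction of None models "suppression": the model declines to attribute
# the quote even though authorship information was available to it.
records = [
    ("group_a", "Author X", "Author X"),
    ("group_a", "Author Y", "Author Y"),
    ("group_a", None,       "Author Z"),
    ("group_b", "Author Q", "Author P"),
    ("group_b", None,       "Author Q"),
    ("group_b", None,       "Author R"),
]

def per_group_metrics(records):
    """Compute attribution accuracy and suppression rate for each group."""
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "suppressed": 0})
    for group, pred, truth in records:
        s = stats[group]
        s["n"] += 1
        if pred is None:
            s["suppressed"] += 1      # omitted attribution entirely
        elif pred == truth:
            s["correct"] += 1         # attributed to the right author
    return {
        g: {
            "accuracy": s["correct"] / s["n"],
            "suppression_rate": s["suppressed"] / s["n"],
        }
        for g, s in stats.items()
    }

metrics = per_group_metrics(records)

# Disparity: the gap between the best- and worst-served groups. Overall
# accuracy can look fine while this gap stays large, which is the kind of
# failure the key points say standard accuracy metrics can miss.
accuracies = [m["accuracy"] for m in metrics.values()]
accuracy_gap = max(accuracies) - min(accuracies)
```

Splitting suppression out from ordinary wrong answers matters: a model that stays silent for one group and guesses for another can show identical accuracy while treating the groups very differently.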
Related Articles

Meta's latest model is as open as Zuckerberg's private school
The Register

Why multi-agent AI security is broken (and the identity patterns that actually work)
Dev.to
BANKING77-77: New best of 94.61% on the official test set (+0.13pp over our previous best of 94.48%)
Reddit r/artificial
A Comprehensive Implementation Guide to ModelScope for Model Search, Inference, Fine-Tuning, Evaluation, and Export
MarkTechPost

Harness Engineering: The Next Evolution of AI Engineering
Dev.to