Talent or Luck? Evaluating Attribution Bias in Large Language Models

arXiv cs.CL / April 30, 2026

💬 Opinion · Ideas & Deep Analysis · Models & Research

Key Points

  • The paper investigates attribution bias in large language models by examining how they assign internal (e.g., effort/ability) versus external (e.g., difficulty/luck) explanations to outcomes.
  • It argues that when LLMs' attributions vary with demographics, the resulting patterns can shape perceptions and influence decisions, raising fairness concerns.
  • Instead of focusing only on surface-level stereotypes, the authors propose a cognitively grounded framework to evaluate disparities in how models reason across demographic groups.
  • The goal is to identify how reasoning differences “channelize” bias toward particular demographic groups, providing a more principled evaluation method.
  • The work is presented as an updated arXiv version (v2), positioning it as a research contribution rather than a product announcement.

Abstract

When a student fails an exam, do we tend to blame their effort or the test's difficulty? Attribution, defined as how reasons are assigned to event outcomes, shapes perceptions, reinforces stereotypes, and influences decisions. Attribution Theory in social psychology explains how humans assign responsibility for events using implicit cognition, attributing causes to internal (e.g., effort, ability) or external (e.g., task difficulty, luck) factors. LLMs' attribution of event outcomes based on demographics carries important fairness implications. Most works exploring social biases in LLMs focus on surface-level associations or isolated stereotypes. This work proposes a cognitively grounded bias evaluation framework to identify how models' reasoning disparities channelize biases toward demographic groups.
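To make the framework's core idea concrete, here is a minimal, hypothetical sketch of an attribution probe: prompts describing the same outcome are varied only by demographic group, model explanations are labeled as internal (effort, ability) or external (difficulty, luck), and the gap in internal-attribution rates is measured. The cue lists, keyword classifier, toy responses, and group names are all illustrative assumptions, not the paper's actual method or data.

```python
import re

# Illustrative cue vocabularies (assumed, not from the paper).
INTERNAL_CUES = {"effort", "ability", "skill", "preparation"}
EXTERNAL_CUES = {"luck", "difficulty", "unfair", "chance"}

def classify_attribution(explanation: str) -> str:
    """Label an explanation as internal, external, or unclear
    via naive keyword matching (a stand-in for a real classifier)."""
    words = set(re.findall(r"[a-z]+", explanation.lower()))
    if words & INTERNAL_CUES:
        return "internal"
    if words & EXTERNAL_CUES:
        return "external"
    return "unclear"

def internal_rate(explanations):
    """Fraction of explanations attributed to internal causes."""
    labels = [classify_attribution(e) for e in explanations]
    return labels.count("internal") / len(labels)

# Toy model outputs for the same failure outcome, with only the
# demographic group in the prompt varied (fabricated examples).
responses = {
    "group_a": ["They did not put in enough effort.",
                "Their preparation was lacking."],
    "group_b": ["The exam difficulty was unusually high.",
                "Bad luck with the question selection."],
}

rates = {g: internal_rate(r) for g, r in responses.items()}
disparity = abs(rates["group_a"] - rates["group_b"])
print(rates, disparity)  # a large gap signals a reasoning disparity
```

In a real evaluation the keyword matcher would be replaced by a more robust labeling step, but the disparity metric (difference in internal-attribution rates across groups) captures the kind of cognitively grounded comparison the paper proposes.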