Shapley meets Rawls: an integrated framework for measuring and explaining unfairness

arXiv cs.LG / 3/30/2026


Key Points

  • The paper proposes an integrated framework that uses Shapley values to both define and explain unfairness rather than treating fairness and explainability as separate topics.
  • It aligns this approach with standard group fairness criteria and enables estimating which input features contribute to unfairness during inference.
  • The authors extend the method from Shapley values to the Efficient-Symmetric-Linear (ESL) family of values to improve robustness of fairness definitions and reduce computation time.
  • In an example using the UCI Census Income dataset, the framework identifies features such as “Age,” “Number of hours,” and “Marital status” as drivers of gender unfairness.
  • The method reports faster runtimes than traditional Bootstrap tests for detecting feature contributions to unfairness.
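To make the core idea concrete, here is a minimal, self-contained sketch (not the paper's code) of decomposing a group-unfairness measure into per-feature Shapley contributions. The value function, data, and "model" below are all illustrative assumptions: a coalition's value is taken to be the demographic-parity gap of a fixed linear classifier when only the features in the coalition are active (the rest replaced by their means), and Shapley values are computed exactly over three features.

```python
# Sketch (assumption, not the paper's implementation): per-feature Shapley
# contributions to a demographic-parity gap, computed exactly for 3 features.
import itertools
import math
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two features correlated with the sensitive attribute, one not.
n = 5000
sex = rng.integers(0, 2, n)                      # hypothetical sensitive attribute
X = np.column_stack([
    rng.normal(40 + 5 * sex, 10, n),             # "age"-like: correlates with sex
    rng.normal(35 + 4 * sex, 8, n),              # "hours"-like: correlates with sex
    rng.normal(0, 1, n),                         # noise: independent of sex
])
w = np.array([0.03, 0.05, 0.5])                  # fixed illustrative model weights

def dp_gap(S):
    """Value function v(S): demographic-parity gap of the classifier when only
    the features in coalition S are active; the rest are set to their means."""
    Xs = np.tile(X.mean(axis=0), (n, 1))
    for j in S:
        Xs[:, j] = X[:, j]
    score = Xs @ w
    yhat = (score > np.median(score)).astype(float)
    return abs(yhat[sex == 1].mean() - yhat[sex == 0].mean())

def shapley(d):
    """Exact Shapley values of dp_gap over d features (tractable for small d)."""
    phi = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for r in range(d):
            for S in itertools.combinations(others, r):
                # Shapley weight |S|! (d - |S| - 1)! / d!
                weight = (math.factorial(r) * math.factorial(d - r - 1)
                          / math.factorial(d))
                phi[j] += weight * (dp_gap(S + (j,)) - dp_gap(S))
    return phi

phi = shapley(3)
# Efficiency: the contributions sum to the full-model unfairness, since the
# empty coalition (all features at their means) has a zero gap.
assert abs(phi.sum() - dp_gap((0, 1, 2))) < 1e-9
```

The efficiency axiom is what makes this an integrated framework in the sense of the key points above: the total unfairness of the model decomposes exactly into per-feature contributions, so the same quantity is both measured and explained.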

Abstract

Explainability and fairness have mainly been considered separately, with recent exceptions trying to explain the sources of unfairness. This paper shows that the Shapley value can be used to both define and explain unfairness under standard group fairness criteria. This offers an integrated framework to estimate and derive inference on unfairness, as well as on the features that contribute to it. Our framework can also be extended from Shapley values to the family of Efficient-Symmetric-Linear (ESL) values, some of which offer more robust definitions of fairness and shorter computation times. An illustration is run on the Census Income dataset from the UCI Machine Learning Repository. Our approach shows that "Age", "Number of hours", and "Marital status" generate gender unfairness, using shorter computation time than traditional Bootstrap tests.
