Decomposing Discrimination: Causal Mediation Analysis for AI-Driven Credit Decisions

arXiv cs.LG / 3/31/2026

Key Points

  • The paper argues that standard statistical fairness metrics in AI credit scoring mix two causally distinct pathways: direct discrimination from protected attributes to outcomes, and indirect effects via financial mediators reflecting structural inequality.
  • It formalizes discrimination decomposition using Pearl-style natural direct/indirect effects for credit decisions, focusing on identification under treatment-induced confounding where protected attributes affect both mediators and the final decision.
  • The authors show interventional direct/indirect effects (IDE/IIE) are identifiable under a weaker Modified Sequential Ignorability assumption, and that IDE/IIE can conservatively bound the otherwise-unidentified natural effects under a monotone indirect treatment response.
  • They introduce a doubly-robust augmented inverse probability weighted (AIPW) estimator with cross-fitting, plus E-value sensitivity analysis for residual direct-path confounding.
  • Using 89,465 HMDA conventional purchase mortgage applications from New York State (2022), the study finds that about 77% of the observed 7.9 percentage-point racial denial disparity is mediated through financial features shaped by structural inequality; the remaining 23% is a conservative lower bound on direct discrimination. An open-source CausalFair Python package implements the full pipeline for deployment.
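
To make the headline numbers concrete, the reported shares can be converted back into percentage points. The sketch below is only arithmetic on the figures quoted above; the function name `decompose_disparity` is hypothetical and is not part of the CausalFair package, and it assumes the mediated and direct shares add up to the full observed gap, as the 77% / 23% split suggests.

```python
def decompose_disparity(total_gap_pp: float, mediated_share: float) -> tuple[float, float]:
    """Split a disparity (in percentage points) into indirect and direct parts.

    Hypothetical helper for illustration only. Assumes the indirect and
    direct components sum to the total observed gap.
    """
    if not 0.0 <= mediated_share <= 1.0:
        raise ValueError("mediated_share must lie in [0, 1]")
    indirect_pp = total_gap_pp * mediated_share   # part flowing through mediators
    direct_pp = total_gap_pp - indirect_pp        # remainder on the direct path
    return indirect_pp, direct_pp


# Figures quoted in the summary: a 7.9 pp gap, about 77% of it mediated.
indirect_pp, direct_pp = decompose_disparity(7.9, 0.77)
print(f"indirect (mediated): {indirect_pp:.1f} pp")  # about 6.1 pp
print(f"direct (lower bound): {direct_pp:.1f} pp")   # about 1.8 pp
```

Because the shares are reported as approximations, the resulting values of roughly 6.1 and 1.8 percentage points should be read as rounded, with the direct component interpreted as a lower bound rather than a point estimate.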

Abstract

Statistical fairness metrics in AI-driven credit decisions conflate two causally distinct mechanisms: discrimination operating directly from a protected attribute to a credit outcome, and structural inequality propagating through legitimate financial features. We formalise this distinction using Pearl's framework of natural direct and indirect effects applied to the credit decision setting. Our primary theoretical contribution is an identification strategy for natural direct and indirect effects under treatment-induced confounding -- the prevalent setting in which protected attributes causally affect both financial mediators and the final decision, violating standard sequential ignorability. We show that interventional direct and indirect effects (IDE/IIE) are identified under the weaker Modified Sequential Ignorability assumption, and prove that IDE/IIE provide conservative bounds on the unidentified natural effects under monotone indirect treatment response. We propose a doubly-robust augmented inverse probability weighted (AIPW) estimator for IDE/IIE with semiparametric efficiency properties, implemented via cross-fitting. An E-value sensitivity analysis addresses residual confounding on the direct pathway. Empirical evaluation on 89,465 real HMDA conventional purchase mortgage applications from New York State (2022) demonstrates that approximately 77% of the observed 7.9 percentage-point racial denial disparity operates through financial mediators shaped by structural inequality, while the remaining 23% constitutes a conservative lower bound on direct discrimination. The open-source CausalFair Python package implements the full pipeline for deployment at resource-constrained financial institutions.
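
The abstract does not say which E-value variant the authors use or what effect size it is applied to. As a rough illustration of the idea, the sketch below computes the standard E-value for a risk ratio (the VanderWeele and Ding formula, E = RR + sqrt(RR × (RR − 1)) for RR ≥ 1). The function name `e_value` and the example risk ratio of 1.5 are assumptions chosen for illustration, not figures from the paper.

```python
import math


def e_value(risk_ratio: float) -> float:
    """Standard E-value for a risk ratio (VanderWeele and Ding).

    Illustrative only: the paper's exact sensitivity analysis may differ.
    The E-value is the minimum strength of association, on the risk-ratio
    scale, that an unmeasured confounder would need with both the exposure
    and the outcome to fully explain away the observed association.
    """
    if risk_ratio <= 0:
        raise ValueError("risk_ratio must be positive")
    # Protective associations are handled by taking the reciprocal first.
    rr = risk_ratio if risk_ratio >= 1.0 else 1.0 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1.0))


# Hypothetical direct-path risk ratio, NOT a result reported in the paper.
example_rr = 1.5
print(f"E-value for RR={example_rr}: {e_value(example_rr):.2f}")
```

A larger E-value means an unmeasured confounder on the direct pathway would have to be more strongly associated with both the protected attribute and the denial outcome to overturn the estimated direct effect.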