First-Mover Bias in Gradient Boosting Explanations: Mechanism, Detection, and Resolution

arXiv cs.AI / 3/25/2026


Key Points

  • The paper identifies “first-mover bias” in gradient boosting explanations as a mechanistic, path-dependent concentration of feature importance caused by sequential residual fitting when correlated features compete for early splits.
  • It explains that the feature chosen first gains a self-reinforcing advantage because later trees inherit residuals that favor the incumbent, leading SHAP-based rankings to become unstable under multicollinearity.
  • The authors show that scaling to a “Large Single Model” (with the same total tree count) produces the worst SHAP explanation stability among tested workflows, making the bias more pronounced in that setting.
  • They demonstrate that breaking the sequential dependency via model independence resolves the issue in the linear regime and remains the most effective mitigation under nonlinear data-generating processes.
  • Two approaches—DASH (Diversified Aggregation of SHAP) and simple seed-averaging (Stochastic Retrain)—restore stability (e.g., at ρ=0.9, stability reaches 0.977 for both), and the paper also introduces diagnostic tools (FSI and IS Plot) to detect the bias without ground truth.
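The sequential mechanism described above can be probed with a small toy. This is an illustrative sketch, not the paper's code: it uses squared loss, depth-1 stumps with a single median threshold per feature, and a per-feature split tally as a crude stand-in for SHAP importance. With two correlated features that contribute equally to the target, which feature accumulates splits is path-dependent on the early rounds of residual fitting.

```python
# Toy probe of path-dependent feature selection in boosting.
# Assumptions (not from the paper): squared loss, depth-1 stumps,
# one median threshold per feature, split counts as importance proxy.
import numpy as np

def boost_split_counts(X, y, n_rounds=50, lr=0.3):
    """Greedy stump boosting; returns how often each feature wins a round."""
    n, d = X.shape
    resid = y.astype(float).copy()
    counts = np.zeros(d, dtype=int)
    for _ in range(n_rounds):
        best_sse, best_pred, best_j = np.inf, None, -1
        for j in range(d):
            t = np.median(X[:, j])            # single candidate threshold
            left = X[:, j] <= t
            m_left, m_right = resid[left].mean(), resid[~left].mean()
            pred = np.where(left, m_left, m_right)
            sse = ((resid - pred) ** 2).sum()
            if sse < best_sse:
                best_sse, best_pred, best_j = sse, pred, j
        counts[best_j] += 1                   # winner takes the round
        resid -= lr * best_pred               # later rounds inherit residuals

    return counts

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + np.sqrt(1 - 0.81) * rng.normal(size=n)  # corr(x1, x2) ~ 0.9
y = x1 + x2 + 0.1 * rng.normal(size=n)                  # both features matter equally
counts = boost_split_counts(np.column_stack([x1, x2]), y)
print("split counts per feature:", counts)
```

Rerunning with a different seed can shift the tally between the two features even though the data-generating process treats them symmetrically, which is the instability that seed-averaging over independent retrains smooths out.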

Abstract

We isolate and empirically characterize first-mover bias -- a path-dependent concentration of feature importance caused by sequential residual fitting in gradient boosting -- as a specific mechanistic cause of the well-known instability of SHAP-based feature rankings under multicollinearity. When correlated features compete for early splits, gradient boosting creates a self-reinforcing advantage for whichever feature is selected first: subsequent trees inherit modified residuals that favor the incumbent, concentrating SHAP importance on an arbitrary feature rather than distributing it across the correlated group. Scaling up a single model amplifies this effect -- a Large Single Model with the same total tree count as our method produces the worst explanations of any approach tested. We demonstrate that model independence is sufficient to resolve first-mover bias in the linear regime, and remains the most effective mitigation under nonlinear data-generating processes. Both our proposed method, DASH (Diversified Aggregation of SHAP), and simple seed-averaging (Stochastic Retrain) restore stability by breaking the sequential dependency chain, confirming that the operative mechanism is independence between explained models. At ρ=0.9, both achieve stability=0.977, while the single-best workflow degrades to 0.958 and the Large Single Model to 0.938. On the Breast Cancer dataset, DASH improves stability from 0.32 to 0.93 (+0.61) against a tree-count-matched baseline. DASH additionally provides two diagnostic tools -- the Feature Stability Index (FSI) and Importance-Stability (IS) Plot -- that detect first-mover bias without ground truth, enabling practitioners to audit explanation reliability before acting on feature rankings. Software and reproducible benchmarks are available at https://github.com/DrakeCaraker/dash-shap.
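The stability scores quoted in the abstract compare feature rankings across independently retrained models. The paper's exact metric is not reproduced here; the following is a hedged sketch in the same spirit, scoring mean pairwise Spearman rank agreement between per-run importance vectors (assumes no ties in the importances).

```python
# Hypothetical rank-agreement score across retrains (not the paper's
# exact stability definition): mean pairwise Spearman correlation of
# the feature-importance rankings produced by each run.
import numpy as np

def spearman(a, b):
    """Spearman correlation via rank transform (assumes no ties)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra * rb).sum() / np.sqrt((ra ** 2).sum() * (rb ** 2).sum()))

def stability(importance_runs):
    """Mean pairwise Spearman correlation over all pairs of runs."""
    runs = np.asarray(importance_runs, dtype=float)
    k = len(runs)
    pairs = [spearman(runs[i], runs[j])
             for i in range(k) for j in range(i + 1, k)]
    return float(np.mean(pairs))

# Two runs that agree on the ranking score 1.0; a reversed ranking scores -1.0.
print(stability([[0.5, 0.3, 0.2], [0.6, 0.25, 0.15]]))  # → 1.0
print(stability([[0.5, 0.3, 0.2], [0.1, 0.2, 0.3]]))    # → -1.0
```

Under this kind of metric, a score near 1.0 (as DASH and Stochastic Retrain reach at ρ=0.9) means every retrain orders the features the same way, while lower scores indicate rankings that shuffle between runs.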