First-Mover Bias in Gradient Boosting Explanations: Mechanism, Detection, and Resolution
arXiv cs.AI / 3/25/2026
Key Points
- The paper identifies “first-mover bias” in gradient boosting explanations as a mechanistic, path-dependent concentration of feature importance caused by sequential residual fitting when correlated features compete for early splits.
- It explains that the feature chosen first gains a self-reinforcing advantage because later trees inherit residuals that favor the incumbent, so SHAP-based rankings become unstable under multicollinearity (a toy reproduction follows this list).
- The authors show that scaling up to a "Large Single Model" (holding the total tree count fixed) yields the least stable SHAP explanations of all tested workflows, making the bias most pronounced in that setting.
- They demonstrate that breaking the sequential dependency via model independence resolves the issue in linear regimes and remains the most effective mitigation under nonlinear data-generating processes.
- Two approaches, DASH (Diversified Aggregation of SHAP) and simple seed-averaging ("Stochastic Retrain"), restore stability (e.g., at ρ = 0.9 both reach a stability of 0.977), and the paper also introduces diagnostic tools (FSI and the IS Plot) to detect the bias without ground truth; a sketch of the seed-averaging idea appears after the first example below.
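
The mechanism is easy to reproduce in miniature. The following is a minimal sketch, not the paper's experimental setup: it assumes a two-feature linear data-generating process with correlation ρ = 0.9 and a subsampled booster (subsampling is needed so the random seed actually changes the fit). Despite symmetric true effects, mean |SHAP| concentrates on whichever feature wins the early splits, and the winner can flip across seeds.

```python
# Toy demonstration of first-mover bias: two equally predictive, highly
# correlated features compete for early splits. DGP, rho, and all
# hyperparameters here are illustrative assumptions.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n, rho = 2000, 0.9

# Two correlated predictors with symmetric true effects on y.
x1 = rng.normal(size=n)
x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=n)

for seed in range(3):
    model = GradientBoostingRegressor(
        n_estimators=200, max_depth=3, subsample=0.8, random_state=seed
    )
    model.fit(X, y)
    sv = shap.TreeExplainer(model).shap_values(X)
    imp = np.abs(sv).mean(axis=0)  # global importance: mean |SHAP| per feature
    print(f"seed={seed}  x1={imp[0]:.3f}  x2={imp[1]:.3f}")
# Whichever feature is picked first keeps absorbing residual signal in later
# trees, so importance concentrates on it; the winner can change with the seed.
```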
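The seed-averaging mitigation can be sketched in a few lines: retrain the booster under K independent seeds and average the per-sample SHAP matrices, so that no single early-split winner dominates the aggregate attribution. The helper name `seed_averaged_shap`, K = 10, and the hyperparameters below are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of the seed-averaging idea ("Stochastic Retrain"):
# average SHAP matrices over independently seeded retrains.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingRegressor

def seed_averaged_shap(X, y, n_seeds=10, **gbm_kwargs):
    """Average per-sample SHAP values over independently seeded retrains
    (hypothetical helper, not from the paper)."""
    acc = np.zeros_like(X, dtype=float)
    for seed in range(n_seeds):
        model = GradientBoostingRegressor(random_state=seed, **gbm_kwargs)
        model.fit(X, y)
        acc += shap.TreeExplainer(model).shap_values(X)
    return acc / n_seeds

# Same correlated two-feature setup as in the previous sketch.
rng = np.random.default_rng(0)
x1 = rng.normal(size=2000)
x2 = 0.9 * x1 + np.sqrt(1 - 0.9**2) * rng.normal(size=2000)
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=2000)

sv = seed_averaged_shap(X, y, n_seeds=10,
                        n_estimators=200, max_depth=3, subsample=0.8)
print(np.abs(sv).mean(axis=0))  # credit is now split far more evenly
```

Judging from its name, DASH presumably goes further by diversifying the retrained models before aggregating their SHAP values; the sketch above varies only the random seed.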