Enforcing Fair Predicted Scores on Intervals of Percentiles by Difference-of-Convex Constraints

arXiv stat.ML / 4/7/2026


Key Points

  • The paper addresses the common tradeoff between fairness constraints and predictive performance in ML by proposing “partially fair” models that enforce fairness only over a chosen percentile interval rather than across all score ranges.
  • It defines statistical metrics to quantify fairness within a specific percentile window, reflecting stakeholder concerns that may be concentrated in high- or low-percentile groups.
  • The authors introduce an in-processing training approach that casts the learning task as constrained optimization using difference-of-convex (DC) constraints.
  • They propose solving the resulting problem with an inexact difference-of-convex algorithm (IDCA) and provide complexity analysis for finding a nearly KKT point.
  • Experiments on real-world datasets indicate the method can preserve high predictive performance while delivering fairness within the targeted percentile interval.
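To make the percentile-interval idea concrete, here is a minimal sketch of one way such a metric could look. This is an illustrative demographic-parity-style gap restricted to a percentile window, not the paper's exact metric; the function name and interface are hypothetical.

```python
import numpy as np

def interval_fairness_gap(scores, groups, p_lo, p_hi):
    """Illustrative sketch: gap between groups in the rate of landing
    inside the [p_lo, p_hi] percentile window of the pooled scores.

    A value of 0 means every group is equally represented in the window;
    larger values indicate group-dependent treatment in that score range.
    """
    lo, hi = np.percentile(scores, [p_lo, p_hi])
    in_window = (scores >= lo) & (scores <= hi)
    rates = [in_window[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

# Toy usage: gap among the top 20% of predicted scores
rng = np.random.default_rng(0)
scores = rng.normal(size=1000)
groups = rng.integers(0, 2, size=1000)
gap = interval_fairness_gap(scores, groups, 80, 100)
```

Focusing the gap on, say, the top percentiles mirrors the paper's motivation: stakeholders often care about who receives the highest (or lowest) scores, not the full score distribution.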

Abstract

Fairness in machine learning has become a critical concern. Existing approaches often focus on achieving full fairness across all score ranges generated by predictive models, ensuring fairness in both high- and low-percentile populations. However, this stringent requirement can compromise predictive performance and may not align with the practical fairness concerns of stakeholders. In this work, we propose a novel framework for building partially fair machine learning models that enforce fairness only within a specific percentile interval of interest while maintaining flexibility in other regions. We introduce statistical metrics to evaluate partial fairness within a given percentile interval. To achieve partial fairness, we propose an in-processing method by formulating the model training problem as constrained optimization with difference-of-convex constraints, which can be solved by an inexact difference-of-convex algorithm (IDCA). We provide the complexity analysis of IDCA for finding a nearly KKT point. Through numerical experiments on real-world datasets, we demonstrate that our framework achieves high predictive performance while enforcing partial fairness where it matters most.
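The core computational tool, the difference-of-convex algorithm, iterates by linearizing the concave part of a DC function and solving the resulting convex subproblem. The following toy, unconstrained DCA sketch (not the paper's inexact, constrained IDCA) minimizes f(x) = x⁴ − 2x², written as g(x) = x⁴ minus h(x) = 2x²; all names here are illustrative.

```python
import math

def dca_step(x):
    # DC decomposition: f(x) = g(x) - h(x), g(x) = x**4, h(x) = 2*x**2.
    # Linearize h at x_k (gradient h'(x_k) = 4*x_k) and solve the convex
    # subproblem min_x g(x) - h'(x_k)*x, i.e. 4*x**3 = 4*x_k,
    # whose solution is the real cube root of x_k.
    return math.copysign(abs(x) ** (1 / 3), x)

x = 0.5
for _ in range(60):
    x = dca_step(x)
# x approaches a stationary point of f (here x = 1, a global minimizer)
```

Each step decreases f because the linearization majorizes −h; the paper's IDCA extends this majorize-then-solve pattern to DC *constraints* and allows the subproblems to be solved inexactly.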