Sparse Learning and Class Probability Estimation with Weighted Support Vector Machines

arXiv stat.ML / 3/25/2026


Key Points

  • The paper addresses classification and class probability estimation in sparse feature settings, where existing ℓ2-norm regularized weighted SVMs can struggle due to redundant noise features.
  • It proposes new weighted SVM (wSVM) frameworks that perform automatic variable selection alongside reliable probability estimation, using ℓ1-norm or elastic-net regularization.
  • The method estimates class probabilities either by applying an ℓ2-regularized wSVM on the selected variables or by directly using an elastic-net regularized wSVM.
  • The authors report that elastic net regularized wSVMs outperform the alternatives for both variable selection and probability estimation, offering variable grouping benefits but requiring additional computation in high-dimensional cases.
  • They discuss extending the approach to K-class problems via ensemble learning, keeping the framework broadly applicable to real-world domains like biology and medicine.
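
As a rough illustration of the two-step recipe summarized above (ℓ1-based variable selection, then weight-indexed SVMs for probability estimation), here is a minimal sketch using scikit-learn's off-the-shelf solvers in place of the authors' algorithms. The toy data, weight grid, and `C` values are assumptions for illustration, not the paper's settings; the probability step follows the general weighted-SVM idea of Wang et al. (2008), where a classifier trained with class weight π predicts class 1 when P(Y=1|x) > π, so averaging predictions over a uniform grid of π approximates the probability.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import SVC, LinearSVC

# Toy sparse problem: 5 informative features among 50 (illustrative only).
X, y = make_classification(n_samples=300, n_features=50, n_informative=5,
                           n_redundant=0, random_state=0)

# Step 1: l1-regularized linear SVM for automatic variable selection.
selector = SelectFromModel(
    LinearSVC(penalty="l1", dual=False, C=0.5, max_iter=5000)
).fit(X, y)
X_sel = selector.transform(X)

# Step 2: a family of weighted SVMs on the selected variables. With loss
# weight (1 - pi) on class 1 and pi on class 0, the weighted Bayes rule
# predicts class 1 exactly when P(Y=1|x) > pi.
pis = np.linspace(0.05, 0.95, 19)
preds = np.array([
    SVC(kernel="linear", C=1.0, class_weight={1: 1 - pi, 0: pi})
    .fit(X_sel, y).predict(X_sel)
    for pi in pis
])  # shape: (len(pis), n_samples), entries in {0, 1}

# Averaging the indicator 1{P(Y=1|x) > pi} over a uniform grid of pi
# approximates P(Y=1|x) itself.
prob = preds.mean(axis=0)
```

This grid-averaging estimator is a coarse stand-in for the paper's probability estimation; in practice the flip point of the predictions along the π grid can be located more precisely, and tuning parameters would be chosen by cross-validation.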

Abstract

Classification and probability estimation are fundamental tasks with broad applications across modern machine learning and data science, spanning fields such as biology, medicine, engineering, and computer science. The recent development of weighted Support Vector Machines (wSVMs) has demonstrated considerable promise in robustly and accurately predicting class probabilities and performing classification across a variety of problems (Wang et al., 2008). However, the existing framework relies on an ℓ2-norm regularized binary wSVM optimization formulation, which is designed for dense features and exhibits limited performance in the presence of sparse features with redundant noise. Effective sparse learning thus requires prescreening of important variables for each binary wSVM to ensure accurate estimation of pairwise conditional probabilities. In this paper, we propose a novel class of wSVM frameworks that incorporate automatic variable selection with accurate probability estimation for sparse learning problems. We develop efficient algorithms for variable selection by solving either the ℓ1-norm or elastic-net regularized wSVM optimization problems. Class probabilities are then estimated either via the ℓ2-norm regularized wSVM framework applied to the selected variables, or directly through elastic-net regularized wSVMs. The two-step approach offers the advantage of simultaneous automatic variable selection and reliable probability estimation with competitive computational efficiency. The elastic-net regularized wSVMs achieve superior performance in both variable selection and probability estimation, with the added benefit of variable grouping, at the cost of increased computation time in high-dimensional settings. The proposed wSVM-based sparse learning methods are broadly applicable and can be naturally extended to K-class problems through ensemble learning.
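
The abstract notes that the binary framework extends to K-class problems through ensemble learning. One standard way to combine the pairwise conditional probabilities that binary wSVMs produce is the pairwise-coupling formula of Price et al.: given estimates r[j,k] ≈ P(Y=j | Y∈{j,k}, x), each class probability satisfies 1/p_j = Σ_{k≠j} 1/r[j,k] − (K−2) when the pairwise estimates are consistent. The sketch below is an assumption about the combining step (the abstract does not specify the ensemble scheme), shown with synthetic pairwise probabilities.

```python
import numpy as np

def couple_pairwise(r):
    """Combine pairwise probabilities r[j, k] ~ P(Y=j | Y in {j,k}, x)
    into K class probabilities via Price et al.-style pairwise coupling."""
    K = r.shape[0]
    p = np.empty(K)
    for j in range(K):
        # 1 / p_j = sum_{k != j} 1 / r[j, k]  -  (K - 2)
        s = sum(1.0 / r[j, k] for k in range(K) if k != j)
        p[j] = 1.0 / (s - (K - 2))
    return p / p.sum()  # normalize to guard against estimation noise

# Sanity check: pairwise probabilities built from known class probabilities
# should be recovered exactly.
p_true = np.array([0.5, 0.3, 0.2])
K = len(p_true)
r = np.ones((K, K))
for j in range(K):
    for k in range(K):
        if j != k:
            r[j, k] = p_true[j] / (p_true[j] + p_true[k])

p_hat = couple_pairwise(r)
```

In the sparse wSVM setting, each r[j,k] would come from a binary wSVM fit on the classes j and k after variable selection, so the K-class estimator inherits the sparsity of the pairwise fits.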