Sequential Minimal Optimization for ε-SVR with MAPE Loss and Sample-Dependent Box Constraints

arXiv stat.ML / 5/5/2026


Key Points

  • The paper derives a Sequential Minimal Optimization (SMO) algorithm for an ε-SVR variant whose loss function directly minimizes the Mean Absolute Percentage Error (MAPE).
  • It introduces sample-dependent dual box constraints for the variables (αk, αk*) of the form [0, 100C/yk], which changes the feasible sets used in working-set selection and the clipping bounds during the two-variable update.
  • Despite these constraint changes, the authors show that the curvature formula and gradient update remain structurally identical to those of standard SMO.
  • The paper also adapts a shrinking heuristic to the sample-dependent bounds, leading to an asymmetry between α and α* governed by the gap term 2ykε/100, and extends the same solver to a symmetric-kernel variant.
  • An open-source implementation is provided via the psvr R package, enabling practical use of the proposed optimizer.
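The caps 100C/yk mean that, unlike in standard ε-SVR, each dual pair gets its own upper bound. Below is a minimal Python sketch of the two pieces this changes: computing the per-sample bounds and the clipping step of the two-variable update for a pair of same-type variables. (Python is used only for illustration; the paper's reference implementation is the psvr R package. Function names are hypothetical, and the sign pattern shown assumes both selected variables come from the same α block.)

```python
import numpy as np

def box_upper(C, y):
    """Per-sample upper bounds C_k = 100*C/y_k on the dual variables.

    The MAPE scaling assumes strictly positive targets y_k; otherwise the
    percentage error (and hence the bound) is undefined.
    """
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("MAPE-scaled bounds require y_k > 0")
    return 100.0 * C / y

def clip_pair_update(alpha_i, alpha_j, delta, C_i, C_j):
    """Clipping step of the analytic two-variable update: alpha_i moves by
    +delta and alpha_j by -delta (preserving the linear equality
    constraint), with delta clipped so alpha_i stays in [0, C_i] and
    alpha_j stays in [0, C_j] -- the bounds now differ per sample."""
    lo = max(-alpha_i, alpha_j - C_j)
    hi = min(C_i - alpha_i, alpha_j)
    delta = min(max(delta, lo), hi)
    return alpha_i + delta, alpha_j - delta
```

For example, with C = 1 and targets y = (50, 100, 200), the caps are (2.0, 1.0, 0.5): samples with small targets get loose bounds, samples with large targets get tight ones, which is exactly the percentage-error weighting.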

Abstract

We derive a Sequential Minimal Optimization (SMO) algorithm for the quadratic dual problem arising from ε-SVR [Vapnik1995, Drucker1997, Smola2004] modified to minimize the Mean Absolute Percentage Error (MAPE) [Makridakis1993, Hyndman2006] directly in the loss function [benavides2025support]. This formulation is part of a broader family of SVR models with percentage-error losses that also includes least-squares variants [Suykens2002] and symmetric-kernel extensions [Espinoza2005], whose unified structure is studied in [benavides2026unified]. The key structural difference from standard ε-SVR is that the box constraints become sample-dependent: αk, αk* ∈ [0, 100C/yk]. We show that this modification affects only (i) the feasibility sets I_up and I_down in working-set selection and (ii) the clipping bounds in the analytic two-variable update, while leaving the curvature formula and gradient update structurally identical to standard SMO [Platt1998, Platt1999, Fan2005]. A shrinking heuristic adapted to the sample-dependent bounds is derived and shown to introduce an asymmetry between the α and α* variables controlled by the gap 2ykε/100. The same solver applies to the symmetric-kernel variant (m2) by replacing Ω with Ω_s = (Ω + aΩ*)/2 [Espinoza2005]. An implementation is available in the open-source psvr R package [BenavidesHerrera2026Rpsvr].
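The claim that only the feasibility sets change can be made concrete with a working-set selection sketch: in a maximal-violating-pair rule in the style of Fan et al. (2005), membership in I_up and I_down is tested against the per-sample cap C_k = 100C/yk rather than a shared C, while the gradient-based violation score is unchanged. The Python below is a simplified illustration under assumed conventions (it treats the coefficient vector abstractly and ignores the ± sign bookkeeping between the α and α* blocks that a full solver carries); the names are hypothetical, not taken from the psvr package.

```python
import numpy as np

def select_working_set(alpha, grad, C_k, tau=1e-12):
    """Simplified maximal-violating-pair selection with per-sample caps.

    I_up:   variables that can still increase (alpha_k < C_k)
    I_down: variables that can still decrease (alpha_k > 0)
    Returns the most violating pair (i, j), or None when the KKT
    conditions hold within tolerance tau.
    """
    alpha = np.asarray(alpha, dtype=float)
    grad = np.asarray(grad, dtype=float)
    C_k = np.asarray(C_k, dtype=float)  # sample-dependent bounds 100*C/y_k

    up = alpha < C_k      # only this membership test changes vs. standard SMO
    down = alpha > 0.0
    if not up.any() or not down.any():
        return None

    # i maximizes -grad over I_up, j minimizes -grad over I_down
    i = np.flatnonzero(up)[np.argmax(-grad[up])]
    j = np.flatnonzero(down)[np.argmin(-grad[down])]
    if (-grad[i]) - (-grad[j]) <= tau:
        return None  # optimal within tolerance
    return int(i), int(j)
```

Note how a variable sitting exactly at its own cap 100C/yk drops out of I_up even though other variables with larger caps remain eligible; this per-sample eligibility is also what drives the adapted shrinking heuristic and its α/α* asymmetry via the gap 2ykε/100.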