Efficient Logistic Regression with Mixture of Sigmoids

arXiv cs.LG / 4/6/2026


Key Points

  • The paper analyzes an Exponential Weights (EW) algorithm with an isotropic Gaussian prior for online logistic regression and shows it attains a near-optimal worst-case regret bound of order O(d log(Bn)) against the best bounded linear predictor.
  • It improves computational efficiency substantially, reducing total worst-case complexity to O(B^3 n^5) from the prior O(B^18 n^37) while keeping the same theoretical regret guarantee.
  • In the large-B regime under linear separability, the EW posterior converges (after rescaling) to a truncated standard Gaussian over the “version cone,” yielding a geometric interpretation of predictions as a solid-angle vote over separating directions.
  • The authors further derive non-asymptotic regret bounds showing that above a margin-dependent threshold, regret becomes independent of B and only grows logarithmically with the inverse margin, linking performance to margin geometry.
  • Overall, the work argues EW is both computationally tractable and geometrically adaptive for online binary classification with logistic loss.
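To make the EW predictor concrete, here is a minimal Monte Carlo sketch (not the paper's efficient algorithm, which achieves the O(B^3 n^5) complexity): sample weight vectors from the isotropic Gaussian prior, reweight each sample by the exponential of its negative cumulative logistic loss on past rounds, and predict the weighted average of sigmoids. The function name `ew_predict` and all parameter choices are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ew_predict(X_past, y_past, x_new, B=1.0, n_samples=50_000, seed=0):
    """Monte Carlo sketch of the EW predictor with an isotropic Gaussian
    prior N(0, B^2 I): the prediction is the posterior-weighted average
    of sigmoid(w . x_new), where each prior sample w is reweighted by
    exp(-cumulative logistic loss on past rounds). Illustrative only;
    the paper's algorithm is far more efficient than naive sampling."""
    rng = np.random.default_rng(seed)
    d = x_new.shape[0]
    W = rng.standard_normal((n_samples, d)) * B   # samples from N(0, B^2 I)
    if len(y_past) > 0:
        # logistic loss log(1 + exp(-y * w.x)) = logaddexp(0, -y * w.x)
        margins = (np.asarray(X_past) @ W.T) * np.asarray(y_past)[:, None]
        log_wts = -np.logaddexp(0.0, -margins).sum(axis=0)
        log_wts -= log_wts.max()                  # numerical stabilization
        wts = np.exp(log_wts)
        wts /= wts.sum()
    else:
        wts = np.full(n_samples, 1.0 / n_samples)
    return float(wts @ sigmoid(W @ x_new))
```

With no past data the prediction is the prior mean of the sigmoid, which is 1/2 by symmetry; after several consistently positive rounds along a direction, the posterior shifts and the prediction on that direction rises above 1/2.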

Abstract

This paper studies the Exponential Weights (EW) algorithm with an isotropic Gaussian prior for online logistic regression. We show that the near-optimal worst-case regret bound O(d log(Bn)) for EW, established by Kakade and Ng (2005) against the best linear predictor of norm at most B, can be achieved with total worst-case computational complexity O(B^3 n^5). This substantially improves on the O(B^18 n^37) complexity of prior work achieving the same guarantee (Foster et al., 2018). Beyond efficiency, we analyze the large-B regime under linear separability: after rescaling by B, the EW posterior converges as B → ∞ to a standard Gaussian truncated to the version cone. Accordingly, the predictor converges to a solid-angle vote over separating directions and, on every fixed-margin slice of this cone, the mode of the corresponding truncated Gaussian is aligned with the hard-margin SVM direction. Using this geometry, we derive non-asymptotic regret bounds showing that once B exceeds a margin-dependent threshold, the regret becomes independent of B and grows only logarithmically with the inverse margin. Overall, our results show that EW can be both computationally tractable and geometrically adaptive in online classification.
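The "solid-angle vote" in the large-B limit can be illustrated with a short Monte Carlo sketch: sample directions from a standard Gaussian, keep those inside the version cone (i.e., those that classify every past example correctly with positive margin), and predict the fraction of surviving directions that label the new point positive. The function name `solid_angle_vote` and the fallback value when no separating direction is sampled are illustrative assumptions, not part of the paper.

```python
import numpy as np

def solid_angle_vote(X, y, x_new, n_samples=200_000, seed=0):
    """Sketch of the B -> infinity limit under linear separability:
    the prediction is the probability, over directions w ~ N(0, I)
    conditioned on lying in the version cone, that w classifies
    x_new as positive. By rotational symmetry of the Gaussian this
    is a ratio of solid angles."""
    rng = np.random.default_rng(seed)
    d = x_new.shape[0]
    W = rng.standard_normal((n_samples, d))
    # version cone: directions separating all past (x, y) pairs
    in_cone = ((np.asarray(X) @ W.T) * np.asarray(y)[:, None] > 0).all(axis=0)
    if not in_cone.any():
        return 0.5  # no separating direction sampled (illustrative fallback)
    return float((W[in_cone] @ x_new > 0).mean())
```

With a single positive example along the first axis, the version cone is the half-space of directions with positive first coordinate: a point on that axis receives a unanimous vote, while an orthogonal point splits the cone evenly and gets a vote near 1/2.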
