Linear Discriminant Analysis with Gradient Optimization

arXiv stat.ML / 4/6/2026


Key Points

  • The paper introduces “LDA with Gradient Optimization” (LDA-GO), a new approach to linear discriminant analysis designed for high-dimensional classification and dimension reduction where standard covariance estimation is unreliable.
  • LDA-GO learns a low-rank precision matrix using scalable gradient-based optimization, while avoiding quadratic-sized intermediate computations so each optimization step scales linearly with dimensionality.
  • It automatically chooses between a Gaussian likelihood and a cross-entropy loss via data-driven structural diagnostics, reducing the need for manual tuning and adapting to different signal structures.
  • The authors provide theoretical results including convexity of the objective functions, Bayes-optimality, and a finite-sample excess error bound.
  • Experiments on simulated and real datasets show LDA-GO outperforming many LDA variants, with particular gains in sparse-signal, high-dimensional regimes.
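The paper's exact parameterization is not spelled out in this summary, but the low-rank-precision idea can be sketched as follows: assume Theta = V Vᵀ with V of size p × k, so the LDA discriminant scores and the gradient of a cross-entropy loss need only n × k, C × k, and p × k intermediates, keeping each step linear in p. The data, shapes, and learning rate below are illustrative, not the authors' settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two classes in p = 500 dimensions, sparse mean shift in 10 coords.
n, p, C, k = 200, 500, 2, 5              # samples, dims, classes, rank of V
mu = np.zeros((C, p)); mu[1, :10] = 1.5
y = rng.integers(0, C, size=n)
X = mu[y] + rng.normal(size=(n, p))
Mu = np.stack([X[y == c].mean(0) for c in range(C)])   # class means (C x p)

def loss_and_grad(V):
    """Cross-entropy over LDA discriminants with precision Theta = V V^T.
    Every intermediate is n x k, C x k, or p x k -- never p x p."""
    A, B = X @ V, Mu @ V                       # V^T x  and  V^T mu_c
    S = A @ B.T - 0.5 * (B ** 2).sum(1)        # n x C discriminant scores
    S -= S.max(1, keepdims=True)               # softmax stabilization
    P = np.exp(S); P /= P.sum(1, keepdims=True)
    loss = -np.log(P[np.arange(n), y]).mean()
    G = (P - np.eye(C)[y]) / n                 # dLoss / dScores
    dB = G.T @ A - G.sum(0)[:, None] * B       # backprop through S wrt B
    dV = X.T @ (G @ B) + Mu.T @ dB             # chain rule back to V (p x k)
    return loss, dV

V = 0.01 * rng.normal(size=(p, k))
losses = []
for _ in range(300):
    loss, dV = loss_and_grad(V)
    losses.append(loss)
    V -= 0.1 * dV                              # plain gradient descent step
```

Note that the product `X @ V` costs O(npk) and `X.T @ (G @ B)` costs O(nCk + npk), so no step ever materializes a p × p matrix, which is the scaling property the paper claims.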

Abstract

Linear discriminant analysis (LDA) is a fundamental classification and dimension reduction method that achieves Bayes optimality under the Gaussian mixture model, but it often struggles in high-dimensional settings where the covariance matrix cannot be reliably estimated. We propose LDA with gradient optimization (LDA-GO), which learns a low-rank precision matrix via scalable gradient-based optimization. The method automatically selects between a Gaussian likelihood and a cross-entropy loss using data-driven structural diagnostics, adapting to the signal structure without user tuning. The gradient computation avoids any quadratic-sized intermediate matrix, keeping the per-iteration cost linear in the number of dimensions. Theoretically, we prove the convexity of the objective functions, the Bayes-optimality of the method, and a finite-sample bound on the excess error. Numerically, a variety of simulations and real-data experiments show that LDA-GO outperforms competing LDA variants in a majority of settings, particularly in sparse-signal, high-dimensional regimes.
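Neither the key points nor the abstract specify what the "data-driven structural diagnostics" are. Purely as an illustration of the mechanism, a diagnostic of this kind could test how Gaussian the within-class residuals look and select the loss accordingly; the function name, the excess-kurtosis statistic, and the threshold below are hypothetical stand-ins, not the paper's actual criterion.

```python
import numpy as np

def select_loss(X, y, kurtosis_tol=1.0):
    """Hypothetical structural diagnostic: if within-class residuals look
    roughly Gaussian (low excess kurtosis), use the Gaussian likelihood;
    otherwise fall back to the more robust cross-entropy loss."""
    # Pool residuals after removing each class mean.
    residuals = np.concatenate(
        [X[y == c] - X[y == c].mean(axis=0) for c in np.unique(y)]
    )
    z = (residuals - residuals.mean(0)) / (residuals.std(0) + 1e-12)
    # Mean absolute excess kurtosis across coordinates (0 for a Gaussian).
    excess_kurtosis = np.abs((z ** 4).mean(axis=0) - 3.0).mean()
    return ("gaussian_likelihood" if excess_kurtosis < kurtosis_tol
            else "cross_entropy")
```

On Gaussian data this returns `"gaussian_likelihood"`, while on heavy-tailed (e.g. Student-t) data it switches to `"cross_entropy"`; any real implementation would presumably use the diagnostics defined in the paper itself.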