End-to-End Large Portfolio Optimization for Variance Minimization with Neural Networks through Covariance Cleaning

arXiv stat.ML / 4/22/2026


Key Points

  • The paper proposes a rotation-invariant neural network that learns lag-transforms of returns and marginal volatilities, while also regularizing eigenvalues of large equity covariance matrices to target minimum-variance portfolios.
  • By constructing the architecture to mirror the analytical global minimum-variance solution (rather than acting as a pure black box), the method aims to preserve interpretability of each module’s function.
  • The approach is trained end-to-end with a loss based on future short-term realized minimum variance, and it achieves lower realized volatility, smaller maximum drawdowns, and higher Sharpe ratios than strong competitors on out-of-sample data spanning Jan 2000–Dec 2024.
  • The learned covariance representation can be plugged into general optimizers to enforce long-only constraints with little loss, and the performance advantage largely persists under realistic trading frictions and during periods of market stress.
  • The model’s dimension-agnostic design allows calibration on a few hundred stocks and application without retraining to larger universes (e.g., ~1,000 US equities), indicating robust generalization.
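
The architecture mirrors the analytical global minimum-variance (GMV) solution, w = Σ⁻¹1 / (1ᵀΣ⁻¹1), where Σ is the (cleaned) covariance matrix. A minimal sketch of that closed form, using a toy factor-style covariance (the data and dimensions here are illustrative, not from the paper):

```python
import numpy as np

def gmv_weights(cov: np.ndarray) -> np.ndarray:
    """Global minimum-variance weights: w = inv(cov) @ 1 / (1' inv(cov) 1).

    Solving the linear system avoids forming the explicit inverse.
    """
    n = cov.shape[0]
    ones = np.ones(n)
    inv_ones = np.linalg.solve(cov, ones)
    return inv_ones / (ones @ inv_ones)

# Toy example: sample covariance of 5 assets from 500 simulated returns,
# with a small ridge for numerical stability.
rng = np.random.default_rng(0)
returns = rng.standard_normal((500, 5))
cov = returns.T @ returns / 500 + 0.01 * np.eye(5)
w = gmv_weights(cov)
```

By construction the weights sum to one, and the resulting portfolio variance wᵀΣw is no larger than that of any other fully invested portfolio, such as equal weights.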

Abstract

We develop a rotation-invariant neural network that provides the global minimum-variance portfolio by jointly learning how to lag-transform historical returns and marginal volatilities and how to regularise the eigenvalues of large equity covariance matrices. This explicit mathematical mapping offers clear interpretability of each module's role, so the model cannot be regarded as a pure black box. The architecture mirrors the analytical form of the global minimum-variance solution yet remains agnostic to dimension, so a single model can be calibrated on panels of a few hundred stocks and applied, without retraining, to one thousand US equities, a cross-sectional jump that indicates robust generalization capability. The loss function is the future short-term realized minimum variance and is optimized end-to-end on real returns. In out-of-sample tests from January 2000 to December 2024, the estimator delivers systematically lower realized volatility, smaller maximum drawdowns, and higher Sharpe ratios than the best competitors, including state-of-the-art non-linear shrinkage, and these advantages persist across both short and long evaluation horizons even though the model's training focuses on the short term. Furthermore, although the model is trained end-to-end to produce an unconstrained minimum-variance portfolio, we show that its learned covariance representation can be used in general optimizers under long-only constraints with virtually no loss in its performance advantage over competing estimators. These advantages persist when the strategy is executed under a highly realistic implementation framework that models market orders at the auctions, empirical slippage, exchange fees, and financing charges for leverage, and they remain stable during episodes of acute market stress.
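
A rotation-invariant covariance estimator keeps the sample eigenvectors and modifies only the eigenvalues. The paper learns that eigenvalue map end-to-end with a neural network; the sketch below substitutes a simple linear shrinkage toward the mean eigenvalue purely as a stand-in, to show the structure such a cleaner takes:

```python
import numpy as np

def clean_covariance(sample_cov: np.ndarray, shrink: float = 0.5) -> np.ndarray:
    """Rotation-invariant cleaning: keep the sample eigenvectors and
    shrink each eigenvalue toward their mean.

    NOTE: linear shrinkage is an illustrative placeholder; the paper's
    network learns a non-linear eigenvalue transform instead.
    """
    vals, vecs = np.linalg.eigh(sample_cov)          # eigendecomposition
    cleaned = (1 - shrink) * vals + shrink * vals.mean()
    return vecs @ np.diag(cleaned) @ vecs.T          # rebuild in same basis

# Toy demo: clean a noisy sample covariance of 4 assets.
rng = np.random.default_rng(1)
B = rng.standard_normal((200, 4))
S = B.T @ B / 200
C = clean_covariance(S)
```

Because the shrinkage moves eigenvalues toward their mean while keeping their sum fixed, the cleaned matrix stays symmetric, positive semi-definite, and trace-preserving, which is why such estimators can be dropped into downstream optimizers (e.g. under long-only constraints) without modification.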