Portfolio Optimization Proxies under Label Scarcity and Regime Shifts via Bayesian and Deterministic Students under Semi-Supervised Sandwich Training

arXiv cs.LG / 4/17/2026


Key Points

  • The paper introduces a machine-learning-assisted portfolio optimization framework aimed at handling label scarcity and uncertain market regimes.
  • It uses a teacher–student pipeline where a CVaR optimizer produces supervisory labels, and Bayesian and deterministic neural models are trained with both real data (104 labeled observations) and synthetic data.
  • Synthetic training data is created via a factor-based model with t-copula residuals to expand effective training beyond the small labeled sample.
  • The authors evaluate four student models across controlled synthetic experiments, in-distribution real-market testing, and cross-universe generalization, using a rolling evaluation protocol with periodic fine-tuning and reset for stability.
  • Results indicate the student models can match or outperform the CVaR teacher in multiple scenarios, with better robustness to regime shifts and lower portfolio turnover.
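
The synthetic-data step in the key points can be illustrated with a minimal sketch. It is not the authors' generator: the function name `simulate_factor_returns`, the factor count, the loadings, the residual scale, and the degrees of freedom are all illustrative assumptions. For simplicity the residuals are drawn from a multivariate Student-t distribution, whose dependence structure is a t-copula with t marginals; the paper may pair the copula with different marginals.

```python
import numpy as np

def simulate_factor_returns(n_obs, loadings, resid_corr, resid_scale, dof, seed=0):
    """Hypothetical sketch: returns = factors @ loadings.T + heavy-tailed residuals."""
    rng = np.random.default_rng(seed)
    n_assets, n_factors = loadings.shape

    # Factor returns: i.i.d. Gaussian here (an assumption, not taken from the paper).
    factors = rng.normal(0.0, 0.02, size=(n_obs, n_factors))

    # Multivariate Student-t residuals: correlated normals divided by a shared
    # chi-square mixing variable. The shared divisor produces joint tail dependence,
    # which is the characteristic feature of a t-copula.
    chol = np.linalg.cholesky(resid_corr)
    z = rng.standard_normal((n_obs, n_assets)) @ chol.T
    mix = np.sqrt(rng.chisquare(dof, size=(n_obs, 1)) / dof)
    residuals = resid_scale * (z / mix)

    return factors @ loadings.T + residuals

# Example with made-up parameters: 3 assets, 2 factors, 500 synthetic observations.
loadings = np.array([[1.0, 0.2], [0.8, -0.1], [0.5, 0.6]])
resid_corr = np.array([[1.0, 0.3, 0.1], [0.3, 1.0, 0.2], [0.1, 0.2, 1.0]])
synthetic = simulate_factor_returns(500, loadings, resid_corr, resid_scale=0.01, dof=4)
```

In a pipeline like the one summarized here, each synthetic return window would then be passed to the CVaR teacher to obtain a label, which is how the training set could grow beyond the 104 real labeled observations.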

Abstract

This paper proposes a machine-learning-assisted portfolio optimization framework designed for low-data environments and regime uncertainty. We construct a teacher-student learning pipeline in which a Conditional Value at Risk (CVaR) optimizer generates supervisory labels, and neural models (Bayesian and deterministic) are trained using both real and synthetically augmented data. The synthetic data is generated using a factor-based model with t-copula residuals, enabling training beyond the limited real sample of 104 labeled observations. We evaluate four student models under a structured experimental framework comprising (i) controlled synthetic experiments (3 x 5 seed grid), (ii) in-distribution real-market evaluation (C2A), and (iii) cross-universe generalization (D2A). In real-market settings, models are deployed using a rolling evaluation protocol in which a frozen pretrained model is periodically fine-tuned on recent observations and then reset to its base state, ensuring stability while allowing limited adaptation. Results show that student models can match or outperform the CVaR teacher in several settings, while achieving improved robustness under regime shifts and reduced turnover. These findings suggest that hybrid optimization-learning approaches can enhance portfolio construction in data-constrained environments.
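
The rolling deployment protocol described in the abstract can be sketched as follows. This is an illustrative toy, not the authors' implementation: the student is a plain linear map trained by gradient descent, and `finetune`, `rolling_finetune_and_reset`, the window length, the step count, and the learning rate are hypothetical. The sketch also fine-tunes at every step for brevity, whereas the paper describes periodic fine-tuning. The point is only the control flow: a copy of the frozen base weights is adapted on recent observations, used for one prediction, and then discarded so the next step starts from the base state again.

```python
import numpy as np

def finetune(weights, X, Y, lr=0.05, steps=20):
    """Run a few gradient steps on mean squared error and return new weights."""
    w = weights.copy()  # never modify the frozen base weights in place
    for _ in range(steps):
        grad = X.T @ (X @ w - Y) / len(X)
        w -= lr * grad
    return w

def rolling_finetune_and_reset(base_weights, features, targets, window=12):
    """Predict each step with a temporarily adapted copy of the base model."""
    preds = []
    for t in range(window, len(features)):
        recent_X = features[t - window:t]
        recent_Y = targets[t - window:t]
        adapted = finetune(base_weights, recent_X, recent_Y)
        preds.append(features[t] @ adapted)
        # Reset: the adapted weights are dropped here, so the next
        # iteration starts again from base_weights.
    return np.array(preds)

rng = np.random.default_rng(0)
features = rng.standard_normal((60, 4))    # made-up state features
targets = rng.standard_normal((60, 3))     # made-up teacher labels for 3 assets
base = rng.standard_normal((4, 3)) * 0.1   # stands in for pretrained weights
base_before = base.copy()
preds = rolling_finetune_and_reset(base, features, targets)
```

Because `finetune` works on a copy, the base weights are identical before and after the loop, which is the stability property the reset step is meant to provide.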