CLVAE: A Variational Autoencoder for Long-Term Customer Revenue Forecasting

arXiv stat.ML / 4/27/2026

Key Points

  • The paper introduces CLVAE, a variational autoencoder designed to forecast long-term customer revenue from sparse and irregular transaction histories in non-contractual settings.
  • It keeps the process-based likelihood structure from established attrition–transaction–spend probabilistic models, but replaces the rigid parametric mixture with a flexible latent representation learned via encoder–decoder networks.
  • The approach provides a single model covering attrition, transactions, and spending. It is intended to stay reliable when contextual covariates are unavailable, and it can incorporate rich covariates and nonlinear effects when they are available.
  • Experiments on multiple real-world datasets and prediction horizons show improved performance over current benchmarks, with practical benefits for marketing resource allocation and campaign targeting.
  • For researchers, the work outlines how to embed domain-specific econometric process models into a variational autoencoder framework to combine interpretability with representation learning flexibility.

Abstract

Predicting customers' long-term revenue from sparse and irregular transaction data is central to marketing resource allocation in non-contractual settings, yet existing approaches face a trade-off. Traditional probabilistic customer base models deliver robust long-horizon forecasts by imposing strong structural assumptions, while flexible machine-learning models often require substantial training data and careful tuning. We propose a variational-autoencoder-based model that preserves the process-based likelihood of established attrition-transaction-spend models conditional on customer heterogeneity, but replaces the restrictive parametric mixing distribution with a flexible latent representation learned by encoder-decoder networks. The resulting approach (i) provides a single model for customer attrition, transactions and spending, (ii) remains reliable when contextual covariates are unavailable, and (iii) flexibly incorporates rich covariates and nonlinear effects when they are available. This design balances structural stability with the flexibility needed to capture complex purchase dynamics. Across multiple real-world datasets and prediction horizons, the proposed model improves upon the latest benchmarks. Businesses benefit directly, as a better assessment of customers' future revenues improves the efficiency of campaign targeting. For research, this work provides guidance on how to embed domain-specific models into the variational autoencoder framework, enabling flexible representation learning while retaining an econometrically meaningful process structure.
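
To make the core idea concrete, here is a minimal sketch of an evidence lower bound (ELBO) that pairs a process-based likelihood with a learned latent representation. It is not the paper's implementation. The abstract does not say which attrition-transaction-spend model is used, so the sketch assumes the discrete-time BG/BB individual-level likelihood as a stand-in, omits the spend component for brevity, and replaces the encoder and decoder networks with hand-set values and a single linear layer. The names `bgbb_log_likelihood`, `elbo_one_customer`, `W`, and `b` are hypothetical.

```python
import numpy as np

def bgbb_log_likelihood(x, t_x, n, p, theta):
    """Log-likelihood of one customer's history under a discrete-time
    buy-till-you-die process, conditional on that customer's parameters.
    ASSUMPTION: BG/BB is used as a stand-in for the paper's process model.
    x: number of purchases, t_x: period of the last purchase (0 if none),
    n: number of observed periods, p: purchase probability while active,
    theta: per-period dropout probability."""
    # Case 1: the customer is still active at the end of the window.
    alive = p**x * (1 - p)**(n - x) * (1 - theta)**n
    # Case 2: the customer dropped out right after period t_x + i.
    i = np.arange(n - t_x)
    dropped = np.sum(
        p**x * (1 - p)**(t_x - x + i) * theta * (1 - theta)**(t_x + i)
    )
    return np.log(alive + dropped)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def kl_to_standard_normal(mu, logvar):
    """KL divergence from a diagonal Gaussian q(z) to the N(0, I) prior."""
    return 0.5 * np.sum(mu**2 + np.exp(logvar) - logvar - 1.0)

def elbo_one_customer(x, t_x, n, mu, logvar, W, b, rng, n_samples=32):
    """Monte Carlo ELBO for a single customer.
    mu, logvar: parameters of q(z | history); in a real VAE an encoder
    network would produce them (here they are passed in directly).
    W, b: a toy linear 'decoder' mapping z to (p, theta) via a sigmoid."""
    # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
    eps = rng.standard_normal((n_samples, mu.size))
    z = mu + np.exp(0.5 * logvar) * eps
    params = sigmoid(z @ W + b)  # shape (n_samples, 2): columns p, theta
    expected_ll = np.mean(
        [bgbb_log_likelihood(x, t_x, n, p, th) for p, th in params]
    )
    return expected_ll - kl_to_standard_normal(mu, logvar)

# Toy usage with made-up numbers: 4 purchases in 12 periods, last in period 7.
rng = np.random.default_rng(0)
latent_dim = 3
W = rng.normal(scale=0.5, size=(latent_dim, 2))
b = np.array([0.0, -2.0])  # bias toward a low dropout probability
mu = np.zeros(latent_dim)
logvar = np.zeros(latent_dim)
value = elbo_one_customer(x=4, t_x=7, n=12, mu=mu, logvar=logvar,
                          W=W, b=b, rng=rng)
print(value)
```

The point of the design is that the likelihood term keeps the structure of the probabilistic process, while the distribution over customer-level parameters comes from the latent variable `z` rather than from a fixed parametric mixing distribution. Training would maximize the sum of such ELBO terms over customers with respect to the encoder and decoder weights.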
