CLVAE: A Variational Autoencoder for Long-Term Customer Revenue Forecasting
Jeffrey Näf, Riana Valera Mbelson, Markus Meierer
TLDR
CLVAE, a VAE, accurately forecasts long-term customer revenue by balancing structural stability with flexible latent representations, improving on benchmarks.
Key contributions
- Integrates customer attrition, transactions, and spending into a single unified model.
- Provides reliable long-term revenue forecasts even without contextual customer covariates.
- Flexibly incorporates rich covariates and nonlinear effects to capture complex purchase dynamics.
- Outperforms state-of-the-art benchmarks on real-world datasets for various prediction horizons.
Why it matters
CLVAE helps businesses optimize marketing by accurately forecasting customer revenue, improving campaign targeting efficiency. Researchers gain guidance on embedding domain-specific models into VAEs, enabling flexible representation learning with econometric meaning.
Original Abstract
Predicting customers' long-term revenue from sparse and irregular transaction data is central to marketing resource allocation in non-contractual settings, yet existing approaches face a trade-off. Traditional probabilistic customer base models deliver robust long-horizon forecasts by imposing strong structural assumptions, while flexible machine-learning models often require substantial training data and careful tuning. We propose a variational-autoencoder-based model that preserves the process-based likelihood of established attrition-transaction-spend models conditional on customer heterogeneity, but replaces the restrictive parametric mixing distribution with a flexible latent representation learned by encoder-decoder networks. The resulting approach (i) provides a single model for customer attrition, transactions and spending, (ii) remains reliable when contextual covariates are unavailable, and (iii) flexibly incorporates rich covariates and nonlinear effects when they are available. This design balances structural stability with the flexibility needed to capture complex purchase dynamics. Across multiple real-world datasets and prediction horizons, the proposed model improves upon the latest benchmarks. Businesses benefit directly, as a better assessment of customers' future revenues improves the efficiency of campaign targeting. For research, this work provides guidance on how to embed domain-specific models into the variational autoencoder framework, enabling flexible representation learning while retaining an econometrically meaningful process structure.
📬 Weekly AI Paper Digest
Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.