Occam's Razor is Only as Sharp as Your ELBO
Ethan Harvey, Michael C. Hughes
TLDR
ELBO-based hyperparameter learning can either underfit or overfit depending on the rank assumed for the Gaussian approximate posterior's covariance, challenging the ELBO's role as an Occam's razor.
Key contributions
- ELBO-based hyperparameter learning can cause overfitting in over-parameterized regression models.
- Overfitting depends on the assumed rank of the covariance matrix in the Gaussian approximate posterior.
- Surprisingly, the true Bayesian evidence can sometimes prefer overfit models, unlike the ELBO.
- Warns practitioners that reduced-rank assumptions made for tractability may impair the ELBO's ability to perform model selection.
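To make the contrast concrete, here is a minimal sketch, not the paper's exact model, using conjugate Bayesian linear regression with prior w ~ N(0, alpha^{-1} I) and known noise variance (both assumptions for illustration). Because the exact posterior is Gaussian, a full-covariance Gaussian q makes the ELBO equal the log evidence, while restricting q to a diagonal covariance (a mean-field analogue of a reduced-rank restriction) strictly loosens the bound, which is the kind of gap that can shift which hyperparameter the ELBO selects.

```python
import numpy as np

# Synthetic regression data (all settings here are illustrative choices).
rng = np.random.default_rng(0)
n, d = 20, 5
X = rng.normal(size=(n, d))
sigma2 = 0.5  # assumed-known noise variance
y = X @ rng.normal(size=d) + rng.normal(scale=np.sqrt(sigma2), size=n)

def log_evidence(alpha):
    # Exact log marginal likelihood: y ~ N(0, sigma2*I + X X^T / alpha).
    C = sigma2 * np.eye(n) + X @ X.T / alpha
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))

def elbo(alpha, m, Sq):
    # ELBO for q(w) = N(m, Sq): E_q[log p(y, w)] + entropy of q.
    r = y - X @ m
    ell = -0.5 * (n * np.log(2 * np.pi * sigma2)
                  + (r @ r + np.trace(X @ Sq @ X.T)) / sigma2)
    lp = 0.5 * d * np.log(alpha / (2 * np.pi)) - 0.5 * alpha * (m @ m + np.trace(Sq))
    ent = 0.5 * d * np.log(2 * np.pi * np.e) + 0.5 * np.linalg.slogdet(Sq)[1]
    return ell + lp + ent

alpha = 1.0
Lam = alpha * np.eye(d) + X.T @ X / sigma2  # exact posterior precision
S = np.linalg.inv(Lam)
m = S @ X.T @ y / sigma2                    # exact posterior mean

# Full-covariance q recovers the evidence exactly; the optimal mean-field
# (diagonal) q for a Gaussian target keeps the true mean but uses
# variances 1/Lam_ii, giving a strictly looser bound.
elbo_full = elbo(alpha, m, S)
elbo_mf = elbo(alpha, m, np.diag(1.0 / np.diag(Lam)))
print(log_evidence(alpha), elbo_full, elbo_mf)
```

Sweeping `alpha` over a grid and maximizing the evidence versus each ELBO variant shows how the restricted family can change the selected hyperparameter, the phenomenon the paper analyzes.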
Why it matters
This paper shows that the ELBO, often used as a proxy for the marginal likelihood in model selection, can produce both underfitting and overfitting, challenging its role as an Occam's razor. This matters for Bayesian practitioners because the reduced-rank covariance assumptions commonly adopted for tractability can compromise model selection, so caution is warranted when scaling variational methods to large models.
Original Abstract
The marginal likelihood, also known as the evidence, is regarded as a mathematical embodiment of Occam's razor, enabling model selection that avoids overfitting. The evidence lower bound (ELBO) objective from variational inference has also been used for similar purposes. Prior work has shown that restricting the approximate posterior family via a mean-field approximation can lead the ELBO to underfit. In this paper, we show how ELBO-based hyperparameter learning in a simple over-parameterized regression model can also produce overfitting, depending on the assumed rank of the covariance matrix in a Gaussian approximate posterior. Surprisingly, among only the underfit and overfit options, Bayesian model selection via the evidence itself sometimes prefers the overfit version, while the ELBO does not. Bayesian practitioners hoping to scale to large models should be cautious about how reduced-rank assumptions needed for tractability may impact the potential for model selection.