
Safe, Scalable, and Accurate Bayes Posterior Sampling for Large-Data Generalized Linear Mixed Models

arXiv:2604.26029

Youngsoo Baek, Samuel I. Berchuck

stat.ME, stat.CO, stat.ML

TLDR

Introduces stochastic mirror Langevin dynamics (SMLD), a data-subsampling sampler for safe, scalable, and accurate Bayesian posterior sampling in large-data generalized linear mixed models (GLMMs).

Key contributions

  • Proposes a novel stochastic mirror Langevin dynamics (SMLD) for Bayesian GLMMs.
  • Offers concrete guidelines for implementing SMLD in Bayesian inference frameworks.
  • Introduces a post-processing step that corrects the bias in posterior variance estimates caused by subsampling.
  • Validates the method on simulations and a longitudinal study of pain trajectories.

Why it matters

Stochastic gradient Langevin dynamics, even when combined with smooth re-parameterizations of variance parameters, produces divergent Markov chains and cannot reliably sample the covariance parameters of random effects in large-data Bayesian GLMMs. This paper proposes a mirror Langevin sampler based on data subsampling, along with a post-processing step that removes the subsampling-induced bias in posterior variance estimates.

Original Abstract

We consider the problem of scalable sampling algorithms to fit Bayesian generalized linear mixed models on large datasets. Stochastic gradient Langevin dynamics, coupled with smooth re-parameterizations of variance parameters, produces divergent Markov chains and cannot be reliably used for sampling covariance parameters of random effects. We advocate the use of a mirror Langevin dynamics algorithm, propose the novel stochastic mirror Langevin dynamics based on data subsampling, and provide concrete guidelines for its use in a Bayesian inference framework. Based on an explicit Wasserstein distance error bound between the posterior and its algorithmic approximation, we propose a post-processing step that yields an asymptotic, order-wise correct estimation of the posterior variance, eliminating the irreducible posterior variance estimation bias due to subsampling. Empirical performance of the method is evaluated through simulated experiments and a longitudinal study of pain trajectories in a study of breast cancer survivors.
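
To make the idea of a subsampled mirror Langevin update concrete, here is a minimal Python sketch. It is not the authors' algorithm: it assumes a toy model (zero-mean Gaussian data with unknown variance x and an inverse-gamma prior) rather than a GLMM, and it assumes the entropic mirror map phi(x) = x log x - x, which keeps x positive because the update is carried out on log x. The function names, step size, batch size, and prior parameters are all hypothetical choices for illustration.

```python
import math
import random

def grad_potential_minibatch(x, batch, n_total, a=2.0, b=2.0):
    """Unbiased minibatch estimate of f'(x), f = negative log posterior of a variance x.

    Assumed toy model: y_i ~ N(0, x), prior x ~ Inverse-Gamma(a, b).
    """
    m = len(batch)
    # Per-observation gradient of 0.5*log(x) + y^2 / (2x)
    lik = sum(0.5 / x - 0.5 * y * y / (x * x) for y in batch)
    # Gradient of the negative log prior (a + 1)*log(x) + b / x
    prior = (a + 1.0) / x - b / (x * x)
    return (n_total / m) * lik + prior

def smld_step(x, batch, n_total, h, rng):
    """One Euler-type mirror Langevin step with a subsampled gradient.

    Assumed mirror map phi(x) = x*log(x) - x, so grad phi(x) = log(x),
    hess phi(x) = 1/x, and the inverse map (grad phi*) is exp.
    """
    y = math.log(x)                              # move to the dual (mirror) space
    g = grad_potential_minibatch(x, batch, n_total)
    noise = math.sqrt(2.0 * h / x) * rng.gauss(0.0, 1.0)  # sqrt(2h) * hess phi(x)^(1/2) * xi
    return math.exp(y - h * g + noise)           # map back; result is always positive

def run_chain(data, n_steps=4000, batch_size=100, h=1e-4, x0=1.0, seed=0):
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)     # subsample without replacement
        x = smld_step(x, batch, len(data), h, rng)
        samples.append(x)
    return samples

# Simulated data with true variance 2.0 (illustrative only)
data_rng = random.Random(1)
data = [data_rng.gauss(0.0, math.sqrt(2.0)) for _ in range(2000)]

samples = run_chain(data)
kept = samples[1000:]                            # discard burn-in
post_mean = sum(kept) / len(kept)
```

In this sketch the minibatch gradient adds its own noise on top of the injected Gaussian noise, so the spread of `kept` may overstate the posterior variance. That is the kind of subsampling bias the abstract's post-processing step targets; the correction itself is not implemented here.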
