Beyond Expected Information Gain: Stable Bayesian Optimal Experimental Design with Integral Probability Metrics and Plug-and-Play Extensions
Di Wu, Ling Liang, Haizhao Yang
TLDR
This paper introduces an IPM-based Bayesian Optimal Experimental Design (BOED) framework that improves stability and accuracy over traditional EIG-based methods.
Key contributions
- Proposes an IPM-based BOED framework replacing KL divergence with integral probability metrics (e.g., the Wasserstein distance); a minimal sketch of the sample-based utility follows this list.
- Establishes theoretical guarantees for IPM-based utilities, showing stronger stability under model error and prior misspecification.
- Empirically demonstrates that IPM-based designs produce highly concentrated credible sets.
- Extends the framework to high-dimensional problems using neural optimal transport, outperforming conventional methods.
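As a rough illustration of the plug-and-play, sample-based template described above, here is a minimal sketch (not the authors' implementation) that scores a design with the energy distance, one of the IPMs named in the abstract, computed between joint (θ, y) samples and product-of-marginals samples. The `prior_sample` and `simulate` helpers and the linear-Gaussian toy model are purely hypothetical assumptions for illustration.

```python
import numpy as np

def energy_distance(X, Y):
    """Sample-based energy distance between point clouds X and Y (an IPM):
    ED(P, Q) = 2 E||x - y|| - E||x - x'|| - E||y - y'||."""
    def mean_pdist(A, B):
        # mean pairwise Euclidean distance between rows of A and rows of B
        diffs = A[:, None, :] - B[None, :, :]
        return np.linalg.norm(diffs, axis=-1).mean()
    return 2.0 * mean_pdist(X, Y) - mean_pdist(X, X) - mean_pdist(Y, Y)

def ipm_utility(design, prior_sample, simulate, n=256, seed=0):
    """Score a design by the energy distance between joint (theta, y) samples
    and product-of-marginals samples -- one way to read the plug-and-play
    sample-based BOED template (an illustrative assumption, not the paper's
    exact estimator)."""
    rng = np.random.default_rng(seed)
    theta = prior_sample(n, rng)                  # draws from the prior
    y = simulate(theta, design, rng)              # forward-simulated outcomes
    joint = np.hstack([theta, y])
    # Shuffling y breaks the theta-y coupling, approximating p(theta)p(y|d).
    marginals = np.hstack([theta, y[rng.permutation(n)]])
    return energy_distance(joint, marginals)

# Hypothetical linear-Gaussian toy model: y = design * theta + noise.
prior_sample = lambda n, rng: rng.normal(size=(n, 1))
simulate = lambda theta, d, rng: d * theta + 0.1 * rng.normal(size=theta.shape)

for d in (0.1, 1.0, 3.0):
    print(f"design={d:.1f}  IPM utility ~ {ipm_utility(d, prior_sample, simulate):.3f}")
```

The plug-and-play aspect would amount to swapping `energy_distance` for another sample-based discrepancy (an MMD, Wasserstein, or neural optimal transport estimator) without changing the rest of the loop.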
Why it matters
This work addresses fundamental limitations of classical Bayesian Optimal Experimental Design, which struggles with stability and accuracy in complex settings. By leveraging integral probability metrics, it offers a more robust and flexible approach. This matters for resource-constrained decision-making where data acquisition is the bottleneck, especially in high-dimensional applications.
Original Abstract
Bayesian Optimal Experimental Design (BOED) provides a rigorous framework for decision-making tasks in which data acquisition is often the critical bottleneck, especially in resource-constrained settings. Traditionally, BOED typically selects designs by maximizing expected information gain (EIG), commonly defined through the Kullback-Leibler (KL) divergence. However, classical evaluation of EIG often involves challenging nested expectations, and even advanced variational methods leave the underlying log-density-ratio objective unchanged. As a result, support mismatch, tail underestimation, and rare-event sensitivity remain intrinsic concerns for KL-based BOED. To address these fundamental bottlenecks, we introduce an IPM-based BOED framework that replaces density-based divergences with integral probability metrics (IPMs), including the Wasserstein distance, Maximum Mean Discrepancy, and Energy Distance, resulting in a highly flexible plug-and-play BOED framework. We establish theoretical guarantees showing that IPM-based utilities provide stronger geometry-aware stability under surrogate-model error and prior misspecification than classical EIG-based utilities. We also validate the proposed framework empirically, demonstrating that IPM-based designs yield highly concentrated credible sets. Furthermore, by extending the same sample-based BOED template in a plug-and-play manner to geometry-aware discrepancies beyond the IPM class, illustrated by a neural optimal transport estimator, we achieve accurate optimal designs in high-dimensional settings where conventional nested Monte Carlo estimators and advanced variational methods fail.
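For contrast with the IPM-based utility sketched above, the nested Monte Carlo EIG estimator that the abstract identifies as a bottleneck can be written as a short sketch; the linear-Gaussian likelihood, the noise level, and the sample sizes below are assumptions chosen only to keep the example self-contained.

```python
import numpy as np

def nested_mc_eig(design, n_outer=200, n_inner=200, noise_sd=0.1, seed=0):
    """Classical nested Monte Carlo estimate of expected information gain:
    EIG(d) ~ (1/N) sum_n [ log p(y_n|theta_n,d) - log (1/M) sum_m p(y_n|theta_m,d) ],
    for a hypothetical linear-Gaussian model y = d*theta + eps, eps ~ N(0, noise_sd^2)."""
    rng = np.random.default_rng(seed)
    theta_outer = rng.normal(size=n_outer)                        # theta_n ~ prior
    y = design * theta_outer + noise_sd * rng.normal(size=n_outer)  # y_n ~ p(y|theta_n,d)
    theta_inner = rng.normal(size=n_inner)                        # theta_m ~ prior (inner loop)

    def log_lik(y_val, theta_val):
        # Gaussian log-likelihood log p(y | theta, d)
        return (-0.5 * ((y_val - design * theta_val) / noise_sd) ** 2
                - np.log(noise_sd * np.sqrt(2 * np.pi)))

    log_cond = log_lik(y, theta_outer)                    # log p(y_n | theta_n, d)
    inner = log_lik(y[:, None], theta_inner[None, :])     # (N, M) inner-loop matrix
    log_marg = np.logaddexp.reduce(inner, axis=1) - np.log(n_inner)  # log p(y_n | d)
    return float(np.mean(log_cond - log_marg))

for d in (0.1, 1.0, 3.0):
    print(f"design={d:.1f}  nested-MC EIG ~ {nested_mc_eig(d):.3f}")
```

The inner marginal-likelihood estimate is the source of the nested-expectation cost and bias that the abstract contrasts with the sample-based IPM utilities.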