Beyond Expected Information Gain: Stable Bayesian Optimal Experimental Design with Integral Probability Metrics and Plug-and-Play Extensions
Di Wu, Ling Liang, Haizhao Yang
TLDR
This paper introduces an IPM-based Bayesian Optimal Experimental Design (BOED) framework that improves stability and accuracy over traditional EIG-based methods.
Key contributions
- Proposes an IPM-based BOED framework replacing KL divergence with integral probability metrics (e.g., the Wasserstein distance); a minimal sketch of the sample-based utility follows this list.
- Establishes theoretical guarantees for IPM-based utilities, showing stronger stability under model error and prior misspecification.
- Empirically demonstrates that IPM-based designs produce highly concentrated credible sets.
- Extends the framework to high-dimensional problems using neural optimal transport, outperforming conventional methods.
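As a rough illustration of the plug-and-play, sample-based template described above, here is a minimal sketch (not the authors' implementation) that scores a design with the energy distance, one of the IPMs named in the abstract, computed between joint (θ, y) samples and product-of-marginals samples. The `prior_sample` and `simulate` helpers and the linear-Gaussian toy model are purely hypothetical assumptions for illustration.

```python
import numpy as np

def energy_distance(X, Y):
    """Sample-based energy distance between point clouds X and Y (an IPM):
    ED(P, Q) = 2 E||x - y|| - E||x - x'|| - E||y - y'||."""
    def mean_pdist(A, B):
        # mean pairwise Euclidean distance between rows of A and rows of B
        diffs = A[:, None, :] - B[None, :, :]
        return np.linalg.norm(diffs, axis=-1).mean()
    return 2.0 * mean_pdist(X, Y) - mean_pdist(X, X) - mean_pdist(Y, Y)

def ipm_utility(design, prior_sample, simulate, n=256, seed=0):
    """Score a design by the energy distance between joint (theta, y) samples
    and product-of-marginals samples -- one way to read the plug-and-play
    sample-based BOED template (an illustrative assumption, not the paper's
    exact estimator)."""
    rng = np.random.default_rng(seed)
    theta = prior_sample(n, rng)                  # draws from the prior
    y = simulate(theta, design, rng)              # forward-simulated outcomes
    joint = np.hstack([theta, y])
    # Shuffling y breaks the theta-y coupling, approximating p(theta)p(y|d).
    marginals = np.hstack([theta, y[rng.permutation(n)]])
    return energy_distance(joint, marginals)

# Hypothetical linear-Gaussian toy model: y = design * theta + noise.
prior_sample = lambda n, rng: rng.normal(size=(n, 1))
simulate = lambda theta, d, rng: d * theta + 0.1 * rng.normal(size=theta.shape)

for d in (0.1, 1.0, 3.0):
    print(f"design={d:.1f}  IPM utility ~ {ipm_utility(d, prior_sample, simulate):.3f}")
```

The plug-and-play aspect would amount to swapping `energy_distance` for another sample-based discrepancy (an MMD, Wasserstein, or neural optimal transport estimator) without changing the rest of the loop.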
Why it matters
This work addresses fundamental limitations of classical Bayesian Optimal Experimental Design, which struggles with stability and accuracy in complex settings. By leveraging integral probability metrics, it offers a more robust and flexible approach. This matters for resource-constrained decision-making where data acquisition is the bottleneck, especially in high-dimensional applications.
Original Abstract
Bayesian Optimal Experimental Design (BOED) provides a rigorous framework for decision-making tasks in which data acquisition is often the critical bottleneck, especially in resource-constrained settings. Traditionally, BOED typically selects designs by maximizing expected information gain (EIG), commonly defined through the Kullback-Leibler (KL) divergence. However, classical evaluation of EIG often involves challenging nested expectations, and even advanced variational methods leave the underlying log-density-ratio objective unchanged. As a result, support mismatch, tail underestimation, and rare-event sensitivity remain intrinsic concerns for KL-based BOED. To address these fundamental bottlenecks, we introduce an IPM-based BOED framework that replaces density-based divergences with integral probability metrics (IPMs), including the Wasserstein distance, Maximum Mean Discrepancy, and Energy Distance, resulting in a highly flexible plug-and-play BOED framework. We establish theoretical guarantees showing that IPM-based utilities provide stronger geometry-aware stability under surrogate-model error and prior misspecification than classical EIG-based utilities. We also validate the proposed framework empirically, demonstrating that IPM-based designs yield highly concentrated credible sets. Furthermore, by extending the same sample-based BOED template in a plug-and-play manner to geometry-aware discrepancies beyond the IPM class, illustrated by a neural optimal transport estimator, we achieve accurate optimal designs in high-dimensional settings where conventional nested Monte Carlo estimators and advanced variational methods fail.
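For contrast with the IPM-based utility sketched above, the nested Monte Carlo EIG estimator that the abstract identifies as a bottleneck can be written as a short sketch; the linear-Gaussian likelihood, the noise level, and the sample sizes below are assumptions chosen only to keep the example self-contained.

```python
import numpy as np

def nested_mc_eig(design, n_outer=200, n_inner=200, noise_sd=0.1, seed=0):
    """Classical nested Monte Carlo estimate of expected information gain:
    EIG(d) ~ (1/N) sum_n [ log p(y_n|theta_n,d) - log (1/M) sum_m p(y_n|theta_m,d) ],
    for a hypothetical linear-Gaussian model y = d*theta + eps, eps ~ N(0, noise_sd^2)."""
    rng = np.random.default_rng(seed)
    theta_outer = rng.normal(size=n_outer)                        # theta_n ~ prior
    y = design * theta_outer + noise_sd * rng.normal(size=n_outer)  # y_n ~ p(y|theta_n,d)
    theta_inner = rng.normal(size=n_inner)                        # theta_m ~ prior (inner loop)

    def log_lik(y_val, theta_val):
        # Gaussian log-likelihood log p(y | theta, d)
        return (-0.5 * ((y_val - design * theta_val) / noise_sd) ** 2
                - np.log(noise_sd * np.sqrt(2 * np.pi)))

    log_cond = log_lik(y, theta_outer)                    # log p(y_n | theta_n, d)
    inner = log_lik(y[:, None], theta_inner[None, :])     # (N, M) inner-loop matrix
    log_marg = np.logaddexp.reduce(inner, axis=1) - np.log(n_inner)  # log p(y_n | d)
    return float(np.mean(log_cond - log_marg))

for d in (0.1, 1.0, 3.0):
    print(f"design={d:.1f}  nested-MC EIG ~ {nested_mc_eig(d):.3f}")
```

The inner marginal-likelihood estimate is the source of the nested-expectation cost and bias that the abstract contrasts with the sample-based IPM utilities.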