Convolutional Maximum Mean Discrepancy for Inference in Noisy Data
Ritwik Vashistha, Jeff M. Phillips, Abhra Sarkar, Arya Farahi
TLDR
This paper introduces Convolutional Maximum Mean Discrepancy (convMMD) for robust statistical inference in data contaminated by measurement noise.
Key contributions
- Introduces Convolutional MMD (convMMD) for robust inference with noisy, heteroscedastic data.
- Establishes finite-sample deviation bounds for convMMD, unaffected by measurement error.
- Proves an equivalence between testing under noise and kernel smoothing techniques.
- Presents a consistent and asymptotically normal convMMD estimator with efficient SGD implementation.
Why it matters
Measurement noise degrades statistical inference, and existing correction techniques are often computationally costly. This paper introduces Convolutional MMD, an efficient framework for robust, distribution-free inference with noisy, heteroscedastic data, offering a practical tool for scientific applications such as astronomy and the social sciences.
Original Abstract
Modern data analyses frequently encounter settings where samples of variables are contaminated by measurement error. Ignoring measurement noise can substantially degrade statistical inference, while existing correction techniques are often computationally costly and inefficient. Recent advances in kernel methods, particularly those based on Maximum Mean Discrepancy (MMD), have enabled flexible, distribution-free inference, yet typically assume precise data and overlook contamination by measurement error. In this work, we introduce a novel framework for inference with samples corrupted by potentially heteroscedastic noise from a known distribution. Central to our approach is the convolutional MMD (convMMD), which compares distributions after noise convolution and retains metric validity under standard kernel conditions. We establish finite-sample deviation bounds that are unaffected by measurement error and prove an equivalence between testing under noise and kernel smoothing. Leveraging these insights, we introduce a convMMD-based estimator for inference with noisy, heteroscedastic observations. We establish its consistency and asymptotic normality, and provide an efficient implementation using stochastic gradient descent. We demonstrate the practical effectiveness of our approach through simulations and applications in astronomy and social sciences.
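To make the core idea concrete, here is a minimal, hypothetical sketch of a convMMD-style two-sample statistic in one dimension. It assumes a Gaussian RBF kernel and additive Gaussian measurement noise with known, possibly heteroscedastic, per-observation variances; under those assumptions the noise-convolved kernel has a closed form via the standard Gaussian convolution identity. This is an illustrative construction based on the abstract's description, not the paper's actual estimator (which also covers the SGD-based implementation and the theoretical guarantees).

```python
import numpy as np

def conv_gauss_kernel(x, y, sx2, sy2, gamma2=1.0):
    """Gaussian RBF kernel analytically convolved with additive Gaussian
    measurement noise of (possibly heteroscedastic) variances sx2, sy2.

    For k(x, y) = exp(-(x - y)^2 / (2 * gamma2)) and independent noise
    N(0, sx2), N(0, sy2), the Gaussian convolution identity gives
    E[k(x + eps, y + delta)]
        = sqrt(gamma2 / s2) * exp(-(x - y)^2 / (2 * s2)),
    where s2 = gamma2 + sx2 + sy2.
    """
    d2 = (x[:, None] - y[None, :]) ** 2
    s2 = gamma2 + sx2[:, None] + sy2[None, :]
    return np.sqrt(gamma2 / s2) * np.exp(-d2 / (2.0 * s2))

def conv_mmd2(x, y, sx2, sy2, gamma2=1.0):
    """Unbiased (U-statistic) estimate of the squared MMD between the
    latent distributions of x and y, using the noise-convolved kernel."""
    m, n = len(x), len(y)
    kxx = conv_gauss_kernel(x, x, sx2, sx2, gamma2)
    kyy = conv_gauss_kernel(y, y, sy2, sy2, gamma2)
    kxy = conv_gauss_kernel(x, y, sx2, sy2, gamma2)
    term_xx = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_yy = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * kxy.mean()
```

With zero noise variances the kernel reduces to the plain Gaussian RBF; as the noise variances grow, the effective bandwidth widens and the kernel peak shrinks, which is one way to see the equivalence between testing under noise and kernel smoothing that the paper proves.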