ArXiv TLDR

Trade-off Functions for DP-SGD with Subsampling based on Random Shuffling: Tight Upper and Lower Bounds

arXiv:2605.06259

Marten van Dijk, Murat Bilgehan Ertan

cs.LG, cs.CR

TLDR

This paper gives a tight, transparent $f$-DP analysis of the trade-off function for DP-SGD with subsampling based on random shuffling.

Key contributions

  • Provides a tight $f$-DP analysis of the trade-off function for DP-SGD with subsampling based on random shuffling.
  • Derives transparent, closed-form bounds for the trade-off function, in contrast to the implicit, non-closed-form formulas obtained under Poisson subsampling (see the sketch after this list).
  • Introduces a new proof technique based on a generalized law of large numbers for asymptotic $O(\sqrt{E})$ privacy guarantees.
  • Demonstrates practical parameter settings where meaningful differential privacy is achieved with random shuffling.
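For intuition, here is a minimal sketch, not the paper's bound, of what a statement like "trade-off function $\geq 1-a-\delta$" means in $f$-DP: using the standard Gaussian trade-off function $G_\mu(a) = \Phi(\Phi^{-1}(1-a) - \mu)$ from the $f$-DP literature, the code measures the largest gap $\delta$ by which a trade-off curve dips below the random-guessing diagonal $1-a$. The function names and the choices of $\mu$ are illustrative assumptions, not the paper's notation.

```python
# Illustrative sketch (NOT the paper's bound): the standard mu-GDP
# trade-off function G_mu(a) = Phi(Phi^{-1}(1 - a) - mu), and the largest
# gap delta by which it falls below the random-guessing diagonal 1 - a.
import numpy as np
from scipy.stats import norm

def gaussian_tradeoff(a: np.ndarray, mu: float) -> np.ndarray:
    """Standard Gaussian (mu-GDP) trade-off function G_mu(a)."""
    return norm.cdf(norm.ppf(1.0 - a) - mu)

def max_gap_below_diagonal(f_vals: np.ndarray, a: np.ndarray) -> float:
    """Smallest delta such that f(a) >= 1 - a - delta for all a."""
    return float(np.max((1.0 - a) - f_vals))

a = np.linspace(1e-6, 1.0 - 1e-6, 10_000)
for mu in (0.05, 0.1, 0.5):  # assumed values, purely for illustration
    gap = max_gap_below_diagonal(gaussian_tradeoff(a, mu), a)
    print(f"mu = {mu:4.2f}: curve stays within delta = {gap:.4f} of the diagonal")
```

A curve with a small maximal gap is close to the diagonal $1-a$, i.e., an adversary can do little better than random guessing; this is exactly the form of guarantee the paper's closed-form bounds quantify.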

Why it matters

This research offers a clearer, more interpretable picture of the privacy guarantees of DP-SGD with random shuffling. Its tight, closed-form bounds give practical guidance for deploying differentially private models, and its new asymptotic analysis improves multi-epoch composition, reducing the dependence of $\delta$ on the number of epochs $E$ from linear to $O(\sqrt{E})$.

Original Abstract

We derive a tight analysis of the trade-off function for Differentially Private Stochastic Gradient Descent (DP-SGD) with subsampling based on random shuffling within the $f$-DP framework. Our analysis covers the regime $\sigma \geq \sqrt{3/\ln M}$, where $\sigma$ is the noise multiplier and $M$ is the number of rounds within a single epoch. Unlike $f$-DP analyses for Poisson subsampling, which yield non-closed implicit formulas that can be machine computed but are non-transparent, random shuffling admits a tight analysis yielding transparent and interpretable closed-form bounds. Our concrete bounds, derived via the Berry-Esseen theorem, are tight up to constant factors within the proof framework. We demonstrate worked parameter settings for a single epoch ($E=1$) with a corresponding trade-off function $\geq 1-a-\delta$, that is, only $\delta$ below the ideal random guessing diagonal $1-a$: for $\delta = 1/100$ and $\sigma = 1$, roughly $M \approx 1.14\times 10^6$ rounds and $N \approx 1.14\times 10^7$ training samples suffice to achieve meaningful differential privacy. This is in contrast to recent negative results for the regime $\sigma \leq 1/\sqrt{2 \ln M}$. Our concrete bounds can be composed over multiple epochs, leading to $\delta$ having a linear-in-$E$ dependency, which restricts $E = O(\sqrt{M})$. To go beyond Berry-Esseen, we introduce a new proof technique based on a generalization of the law of large numbers that yields an asymptotic random-guessing-diagonal limit result: if $E = c_M^2 M$ with $c_M \to 0$, then the $E$-fold composed trade-off function satisfies $f^{\otimes E}(a) \to 1-a$ uniformly in $a \in [0,1]$, with $\delta$ having only an $O(\sqrt{E})$ dependency. We compare this asymptotic regime with the corresponding Poisson subsampling asymptotic, and highlight the characterization of explicit convergence rates as an open question.
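As a quick sanity check on the abstract's worked single-epoch example, the sketch below evaluates the quoted regime boundaries for $\sigma$ and the epoch budget implied by $E = O(\sqrt{M})$. The variable names and the reading of $N/M$ as samples per round are our assumptions; this is arithmetic only, not the paper's proof.

```python
# Sanity check of the regimes quoted in the abstract, using the worked
# single-epoch example (sigma = 1, M ~ 1.14e6 rounds, N ~ 1.14e7 samples).
# Variable names are ours, not the paper's notation.
import math

sigma, M, N = 1.0, 1.14e6, 1.14e7

covered = math.sqrt(3 / math.log(M))        # analysis covers sigma >= this
negative = 1 / math.sqrt(2 * math.log(M))   # negative results hold for sigma <= this

print(f"ln M                   = {math.log(M):.2f}")
print(f"sqrt(3 / ln M)         = {covered:.3f}  (sigma = {sigma} clears it)")
print(f"1 / sqrt(2 ln M)       = {negative:.3f}  (far below sigma = {sigma})")
print(f"samples per round      = {N / M:.0f}     (assuming M equal minibatches)")
print(f"epoch budget ~ sqrt(M) = {math.isqrt(round(M))}")  # from E = O(sqrt(M))
```

With these numbers, $\ln M \approx 13.95$, so $\sqrt{3/\ln M} \approx 0.464$: the worked setting $\sigma = 1$ sits comfortably inside the covered regime and well above the negative-result threshold $1/\sqrt{2\ln M} \approx 0.19$.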
