Integrating Feature Correlation in Differential Privacy with Applications in DP-ERM
Tianyu Wang, Luhao Zhang, Rachel Cummings
TLDR
CorrDP introduces a relaxed differential privacy framework that accounts for feature correlations, improving utility for insensitive features in DP-ERM.
Key contributions
- Proposes a relaxed DP definition accounting for sensitive/insensitive feature heterogeneity.
- Introduces `CorrDP` framework to relax privacy for insensitive features, quantifying correlation via total variation distance.
- Develops DP-ERM algorithms under `CorrDP` using distance-dependent noise for enhanced utility guarantees.
- Demonstrates that `CorrDP`-based DP-ERM algorithms consistently outperform the standard DP framework in experiments with insensitive features.
Why it matters
Standard differential privacy often over-protects insensitive features, leading to unnecessary utility loss. This paper offers a practical alternative in `CorrDP`, which relaxes privacy constraints on insensitive features while still accounting for their correlations with sensitive ones. The result is significantly improved utility for DP-ERM algorithms without compromising privacy for sensitive data.
Original Abstract
Standard differential privacy imposes uniform privacy constraints across all features, overlooking the inherent distinction between sensitive and insensitive features in practice. In this paper, we introduce a relaxed definition of differential privacy that accounts for such heterogeneity, allowing certain features to be treated as insensitive even when correlated with sensitive ones. We propose a correlation-aware framework, $\textsf{CorrDP}$, which relaxes privacy for insensitive features while accounting for their correlations with sensitive features, with the correlations quantified using total variation distance. We design algorithms for differentially private empirical risk minimization (DP-ERM) under the $\textsf{CorrDP}$ framework, incorporating distance-dependent noise into gradients for improved theoretical utility guarantees. When the correlation distance is unknown, we estimate it from the dataset and show that it achieves a comparable privacy-utility guarantee. We perform experiments on synthetic and real-world datasets and show that $\textsf{CorrDP}$-based DP-ERM algorithms consistently outperform the standard DP framework in the presence of insensitive features.
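The core idea, adding distance-dependent noise to gradients so that insensitive coordinates receive less perturbation than sensitive ones, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the names (`corr_dp_noisy_gradient`, `sensitive_mask`, `tv_distance`) and the specific calibration of the insensitive-coordinate noise scale by the total variation distance are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def corr_dp_noisy_gradient(grad, sensitive_mask, sigma, tv_distance):
    """Hypothetical sketch of heterogeneous gradient perturbation.

    Sensitive coordinates receive full-scale Gaussian noise (sigma);
    insensitive coordinates receive noise scaled down by a correlation
    distance in [0, 1] (here, loosely, a total variation distance).
    The actual CorrDP calibration in the paper may differ.
    """
    grad = np.asarray(grad, dtype=float)
    # Per-coordinate noise scale: sigma where sensitive, reduced elsewhere.
    scale = np.where(sensitive_mask, sigma, sigma * tv_distance)
    return grad + rng.normal(loc=0.0, scale=scale, size=grad.shape)

# Toy usage: a 4-dimensional gradient with two sensitive coordinates.
g = np.array([0.5, -1.0, 2.0, 0.1])
mask = np.array([True, True, False, False])
noisy_g = corr_dp_noisy_gradient(g, mask, sigma=1.0, tv_distance=0.2)
```

When `tv_distance` is 0 (insensitive features carry no information about sensitive ones under this toy model), the insensitive coordinates are released noise-free; as the correlation distance grows toward 1, their noise approaches the standard-DP scale.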