ArXiv TLDR

Integrating Feature Correlation in Differential Privacy with Applications in DP-ERM

2605.03945

Tianyu Wang, Luhao Zhang, Rachel Cummings

cs.LG, stat.ML

TLDR

CorrDP is a relaxed differential privacy framework that accounts for feature correlations, improving utility on insensitive features in DP-ERM.

Key contributions

  • Proposes a relaxed DP definition accounting for sensitive/insensitive feature heterogeneity.
  • Introduces the `CorrDP` framework, which relaxes privacy for insensitive features and quantifies their correlation with sensitive ones via total variation distance.
  • Develops DP-ERM algorithms under `CorrDP` using distance-dependent noise for enhanced utility guarantees.
  • Demonstrates `CorrDP` significantly outperforms standard DP in experiments with insensitive features.
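
The distance-dependent noise idea in the third bullet can be sketched as follows. This is an illustrative assumption, not the paper's actual algorithm: the function name, the `sensitive_mask` and `tv_distance` parameters, and the Gaussian-mechanism calibration are all hypothetical, showing only how noise on insensitive coordinates might be attenuated by the correlation distance.

```python
import numpy as np

def corrdp_noisy_gradient(grad, sensitive_mask, tv_distance,
                          epsilon, delta, clip=1.0, seed=None):
    """Hypothetical sketch of distance-dependent gradient noise.

    `sensitive_mask` flags gradient coordinates tied to sensitive
    features; `tv_distance` in [0, 1] is the total variation distance
    measuring how correlated the insensitive features are with the
    sensitive ones (0 = uncorrelated, 1 = fully correlated).
    """
    rng = np.random.default_rng(seed)
    grad = np.asarray(grad, dtype=float)
    # Clip to bound the L2 sensitivity of the gradient.
    norm = np.linalg.norm(grad)
    grad = grad * min(1.0, clip / max(norm, 1e-12))
    # Standard Gaussian-mechanism noise scale for (epsilon, delta)-DP.
    sigma = clip * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    # Sensitive coordinates get full noise; insensitive coordinates get
    # noise attenuated in proportion to the correlation distance.
    scale = np.where(sensitive_mask, sigma, tv_distance * sigma)
    return grad + rng.normal(0.0, 1.0, grad.shape) * scale
```

With `tv_distance = 1` this reduces to uniform noise across all coordinates (standard DP-SGD-style perturbation); with `tv_distance = 0` the insensitive coordinates are released noiselessly.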

Why it matters

Standard differential privacy often over-protects insensitive features, leading to unnecessary utility loss. This paper offers a practical solution by introducing `CorrDP`, which differentiates between sensitive and insensitive features. By accounting for the correlations between them, it significantly improves the utility of DP-ERM algorithms without compromising privacy for sensitive data.

Original Abstract

Standard differential privacy imposes uniform privacy constraints across all features, overlooking the inherent distinction between sensitive and insensitive features in practice. In this paper, we introduce a relaxed definition of differential privacy that accounts for such heterogeneity, allowing certain features to be treated as insensitive even when correlated with sensitive ones. We propose a correlation-aware framework, $\textsf{CorrDP}$, which relaxes privacy for insensitive features while accounting for their correlations with sensitive features, with the correlations quantified using total variation distance. We design algorithms for differentially private empirical risk minimization (DP-ERM) under the $\textsf{CorrDP}$ framework, incorporating distance-dependent noise into gradients for improved theoretical utility guarantees. When the correlation distance is unknown, we estimate it from the dataset and show that it achieves a comparable privacy-utility guarantee. We perform experiments on synthetic and real-world datasets and show that $\textsf{CorrDP}$-based DP-ERM algorithms consistently outperform the standard DP framework in the presence of insensitive features.
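
The abstract notes that when the correlation distance is unknown, it is estimated from the dataset. As a rough illustration of what a plug-in estimator for a total variation distance between two one-dimensional samples could look like (a histogram-based sketch under stated assumptions; the paper's actual estimator and privacy accounting for it may differ):

```python
import numpy as np

def empirical_tv_distance(x, y, bins=10):
    """Histogram plug-in estimate of the total variation distance
    between the distributions of two 1-D samples.

    Illustrative only: bins the two samples on a shared range and
    returns half the L1 distance between the normalized histograms.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    lo = min(x.min(), y.min())
    hi = max(x.max(), y.max())
    p, _ = np.histogram(x, bins=bins, range=(lo, hi))
    q, _ = np.histogram(y, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    # TV(P, Q) = (1/2) * sum |p_i - q_i| for discrete distributions.
    return 0.5 * np.abs(p - q).sum()
```

Identical samples give an estimate of 0, samples with disjoint support give 1, matching the interpretation of the distance as the degree of correlation-induced leakage.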
