Differentially Private Clustered Federated Learning with Privacy-Preserving Initialization and Normality-Driven Aggregation
Jie Xu, Haaris Mehmood, Rogier Van Dalen, Karthikeyan Saravanan, Mete Ozay
TLDR
PINA is a two-stage framework for Differentially Private Clustered Federated Learning, improving accuracy and privacy with robust initialization and aggregation.
Key contributions
- Proposes PINA, a two-stage framework for Differentially Private Clustered Federated Learning (DP-CFL).
- Uses LoRA adapters and private sketches for robust, privacy-preserving cluster centroid initialization.
- Introduces a normality-driven aggregation mechanism to improve DP-CFL convergence and robustness.
- Outperforms state-of-the-art DP-FL algorithms by an average of 2.9% in accuracy for common privacy budgets (ε ∈ {2, 8}).
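The first stage can be pictured with a minimal, hypothetical sketch. The paper does not publish its exact construction, so the choices below (a seeded Gaussian random projection as the "compressed sketch", norm clipping plus the Gaussian mechanism for privacy, and plain k-means for centroid construction) are illustrative assumptions; all function names are invented:

```python
import math
import random

def dp_sketch(update, proj_seed, sketch_dim, clip_norm, noise_mult):
    """Clip a client's (flattened) LoRA update, project it to a low
    dimension, and add Gaussian noise -- one plausible form of a
    privately shared 'sketch'."""
    # Clip so each client's L2 contribution is bounded by clip_norm.
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / (norm + 1e-12))
    clipped = [x * scale for x in update]
    # Random projection with a seed shared by all clients, so all
    # sketches live in the same low-dimensional space.
    rng = random.Random(proj_seed)
    d = len(update)
    sketch = []
    for _ in range(sketch_dim):
        row = [rng.gauss(0.0, 1.0 / math.sqrt(sketch_dim)) for _ in range(d)]
        sketch.append(sum(r * x for r, x in zip(row, clipped)))
    # Gaussian mechanism: per-client noise proportional to sensitivity.
    return [s + random.gauss(0.0, noise_mult * clip_norm) for s in sketch]

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on the noisy sketches; the resulting centroids
    serve as the server's initial cluster centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            groups[j].append(p)
        for j, g in enumerate(groups):
            if g:
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids
```

For example, six clients whose updates come from two distinct data distributions would each call `dp_sketch` locally, and the server would run `kmeans(sketches, k=2)` on the noisy sketches alone, never seeing raw updates.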
Why it matters
This paper addresses the challenge of combining clustered federated learning with differential privacy, which is crucial for secure FL over heterogeneous data. PINA's initialization and aggregation methods improve accuracy and robustness while providing formal privacy guarantees, advancing practical, privacy-preserving FL deployments.
Original Abstract
Federated learning (FL) enables training of a global model while keeping raw data on end-devices. Despite this, FL has been shown to leak private user information, and thus in practice it is often coupled with methods such as differential privacy (DP) and secure vector sum to provide formal privacy guarantees to its participants. In realistic cross-device deployments, the data are highly heterogeneous, so vanilla federated learning converges slowly and generalizes poorly. Clustered federated learning (CFL) mitigates this by segregating users into clusters, leading to lower intra-cluster data heterogeneity. Nevertheless, coupling CFL with DP remains challenging: the injected DP noise makes individual client updates excessively noisy, and the server is unable to initialize cluster centroids with the less noisy aggregated updates. To address this challenge, we propose PINA, a two-stage framework that first lets each client fine-tune a lightweight low-rank adaptation (LoRA) adapter and privately share a compressed sketch of the update. The server leverages these sketches to construct robust cluster centroids. In the second stage, PINA introduces a normality-driven aggregation mechanism that improves convergence and robustness. Our method retains the benefits of clustered FL while providing formal privacy guarantees against an untrusted server. Extensive evaluations show that our proposed method outperforms state-of-the-art DP-FL algorithms by an average of 2.9% in accuracy for privacy budgets ε ∈ {2, 8}.
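One way to read "normality-driven aggregation": since DP-noised updates from honest, same-cluster clients should scatter roughly Gaussianly around the cluster mean, the server can flag coordinates that deviate from that Gaussian picture before averaging. The sketch below implements this idea with a robust z-score (median/MAD) screen; it is an illustrative interpretation under that assumption, not PINA's published mechanism, and all names are invented:

```python
import statistics

def normality_screened_mean(values, z_max=3.0):
    """Average one coordinate across clients, dropping values that lie
    further from the robust center than a Gaussian would plausibly allow."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    # MAD -> standard deviation under normality (factor 1.4826);
    # fall back to a tiny sigma when all deviations are zero.
    sigma = 1.4826 * mad or 1e-12
    inliers = [v for v in values if abs(v - med) / sigma <= z_max]
    return sum(inliers) / len(inliers) if inliers else med

def aggregate(client_updates, z_max=3.0):
    """Coordinate-wise robust aggregation of per-client update vectors."""
    return [normality_screened_mean(list(col), z_max)
            for col in zip(*client_updates)]
```

With three well-behaved clients and one wildly deviating one, `aggregate` recovers roughly the mean of the three inliers in each coordinate, whereas a plain average would be dragged toward the outlier.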