
DP-SGD Blocks Gradient Reconstruction; PDP Fails

Defenses
Published: Tue, Oct 28, 2025 • By Lydia Stratus
Researchers test gradient leakage attacks in federated learning and evaluate two differential privacy methods. They find DP-SGD (differential privacy with stochastic gradient descent) meaningfully reduces reconstruction of private data from gradients but lowers model accuracy. A PDP-SGD variant preserves accuracy yet fails to stop reconstruction. The work stresses empirical validation and recommends complementary measures such as secure aggregation.

Federated Learning (FL) promises to train models without moving raw data. That promise has an ugly corner: shared model updates can leak private information. The paper under discussion measures how well two Differential Privacy (DP) techniques resist Gradient Leakage Attacks (GLAs) in a simulated FL setting. The experiments are simple but revealing: DP-SGD reduces the ability of an attacker to reconstruct private inputs from intercepted gradients, while a regularisation-based variant called PDP-SGD does not.

What the study tested

The authors trained a handful of vision models on a binary subset of a food dataset and, separately, ran reconstruction attacks on MNIST. They compared three training regimes: standard non-private training, DP-SGD (per-sample gradient clipping and noise addition, implemented with Opacus), and PDP-SGD (explicit regularisation designed to mimic the effect of noise). Privacy budgets explored for DP-SGD ranged from moderately strict to very permissive (epsilon of 8, 25 and 50), with delta chosen as roughly one over the training set size. For the attacks they used a gradient-matching optimiser that tries to recreate an input image from a single intercepted gradient.
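For readers unfamiliar with how DP-SGD is wired in, the sketch below shows a typical Opacus setup on synthetic stand-in data. The model, dataset, epoch count and clipping bound are illustrative assumptions rather than the paper's configuration; only the epsilon of 8 and the delta of roughly one over the training set size mirror the budgets described above.

import torch
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Synthetic stand-in for a small binary image classification set (assumed data).
images = torch.randn(256, 3, 32, 32)
labels = torch.randint(0, 2, (256,))
train_loader = DataLoader(TensorDataset(images, labels), batch_size=32)

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Wrap model, optimiser and loader so training becomes DP-SGD:
# per-sample gradient clipping plus calibrated Gaussian noise on each step.
privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    epochs=5,                      # assumed training length
    target_epsilon=8.0,            # one of the budgets explored in the paper
    target_delta=1 / len(images),  # delta roughly one over the training set size
    max_grad_norm=1.0,             # clipping bound (an assumed value)
)

criterion = torch.nn.CrossEntropyLoss()
for epoch in range(5):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

print("epsilon spent:", privacy_engine.get_epsilon(delta=1 / len(images)))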

Results are straightforward. DP-SGD meaningfully degrades reconstruction quality: recovered images are noisy and have low structural similarity to originals, indicating a practical barrier to GLAs. That protection does come with a cost. Models trained with DP-SGD lose some classification accuracy, and stricter privacy (lower epsilon) worsens utility. PDP-SGD often preserves or even improves accuracy for simple architectures, but it fails as a practical defence — reconstructions from PDP-SGD-protected gradients look similar to those from non-private training. The paper also notes instability in attacks: larger, multi-channel images and more complex architectures make reconstruction harder in practice, so attack success depends on task and model.

There are important caveats. The experiments use limited datasets, frozen-backbone setups, and single-gradient interceptions rather than full aggregation flows. That means results are informative but not universally transferable to every production FL deployment.

For ops and security teams this translates into clear, actionable guidance: do not accept theoretical privacy claims without empirical tests. If you use FL where data sensitivity matters, start with DP-SGD and a realistic privacy budget, expect utility trade-offs, and measure leakage directly. Complement DP with protocol-level protections such as secure aggregation so no single party sees raw updates.
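Secure aggregation is a protocol rather than a library call, but its core idea is easy to show: clients apply pairwise random masks that cancel when the server sums the updates, so no single party ever sees an individual raw update. The toy sketch below illustrates only that cancellation; it omits key agreement, finite-field arithmetic and dropout handling, so it is not a deployable protocol.

import numpy as np

# Toy illustration of the masking idea behind secure aggregation: each pair of
# clients shares a random mask that one adds and the other subtracts, so every
# individual masked update looks like noise while the server's sum is unchanged.
rng = np.random.default_rng(0)
n_clients, dim = 3, 4
updates = [rng.normal(size=dim) for _ in range(n_clients)]

pair_masks = {(i, j): rng.normal(size=dim)
              for i in range(n_clients) for j in range(i + 1, n_clients)}

masked = []
for i in range(n_clients):
    m = updates[i].copy()
    for j in range(n_clients):
        if i < j:
            m += pair_masks[(i, j)]
        elif j < i:
            m -= pair_masks[(j, i)]
    masked.append(m)

# The server only ever sees `masked`, yet the aggregate is exact.
print(np.allclose(sum(masked), sum(updates)))  # True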

Practical run-book (short):
1) Threat-model your FL use case and decide acceptable privacy budgets.
2) Prototype DP-SGD at expected scale and measure utility versus reconstruction risk (a minimal measurement sketch follows this list).
3) Add secure aggregation and replay-resistant transport so gradients cannot be intercepted in the clear.
4) Re-evaluate after model or data changes.
Think of this as iterative: tuning epsilon and deployment architecture should follow tests, not paperwork.
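For step 2, "measure leakage directly" can be as simple as scoring attack reconstructions against the originals with SSIM, the metric the paper relies on. Below is a minimal sketch using scikit-image, with random arrays standing in for real images; in practice you would feed it outputs of a gradient-matching attack run against your own model.

import numpy as np
from skimage.metrics import structural_similarity as ssim

def leakage_score(original, reconstruction):
    # SSIM between a private input and the attacker's reconstruction.
    # Values near 1 mean the gradient leaked a recognisable copy;
    # values near 0 mean the reconstruction carries little structure.
    return ssim(original, reconstruction,
                data_range=float(original.max() - original.min()))

# Random stand-ins at MNIST scale (28x28 grayscale) for illustration.
rng = np.random.default_rng(0)
original = rng.random((28, 28))
print("vs. unrelated noise:", leakage_score(original, rng.random((28, 28))))  # low
print("vs. itself:", leakage_score(original, original.copy()))                # 1.0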

The takeaway is modest and useful. DP-SGD works as an empirical mitigator against GLAs in the paper's settings but costs accuracy. PDP-SGD keeps accuracy and does not stop reconstruction. Combine DP with aggregation and continuous empirical validation before trusting FL with sensitive data.

Additional analysis of the original arXiv paper

📋 Original Paper Title and Abstract

Differential Privacy: Gradient Leakage Attacks in Federated Learning Environments

Authors: Miguel Fernandez-de-Retana, Unai Zulaika, Rubén Sánchez-Corcuera, and Aitor Almeida
Federated Learning (FL) allows for the training of Machine Learning models in a collaborative manner without the need to share sensitive data. However, it remains vulnerable to Gradient Leakage Attacks (GLAs), which can reveal private information from the shared model updates. In this work, we investigate the effectiveness of Differential Privacy (DP) mechanisms - specifically, DP-SGD and a variant based on explicit regularization (PDP-SGD) - as defenses against GLAs. To this end, we evaluate the performance of several computer vision models trained under varying privacy levels on a simple classification task, and then analyze the quality of private data reconstructions obtained from the intercepted gradients in a simulated FL environment. Our results demonstrate that DP-SGD significantly mitigates the risk of gradient leakage attacks, albeit with a moderate trade-off in model utility. In contrast, PDP-SGD maintains strong classification performance but proves ineffective as a practical defense against reconstruction attacks. These findings highlight the importance of empirically evaluating privacy mechanisms beyond their theoretical guarantees, particularly in distributed learning scenarios where information leakage may represent an unassumable critical threat to data security and privacy.

🔍 ShortSpan Analysis of the Paper

Problem

Federated learning enables collaborative model training without sharing raw data, yet shared updates can leak private information through gradient leakage attacks. The paper evaluates differential privacy mechanisms as defenses against such attacks, focusing on DP-SGD and a variant based on explicit regularisation (PDP-SGD). The aim is to determine how these protections affect both the risk of reconstructing private data from intercepted gradients and the utility of trained models in a practical distributed learning setting.

Approach

Experiments use three computer vision models trained on a binary food classification task derived from a class-balanced subset of Food-101: a simple Custom-CNN trained from scratch, a ResNet50 pretrained on ImageNet with the backbone frozen and only a final head trained, and a Vision Transformer based on DINOv2 with Registers, also with a frozen backbone. Training covered standard, DP-SGD and PDP-SGD regimes. For DP-SGD, privacy budgets of ε = 8, 25 and 50 were explored with δ ≈ 1/N, where N is the training set size, using the Opacus library. PDP-SGD implemented explicit regularisation with a regularisation constant κ = η²σ², where η is the learning rate and σ the noise multiplier. To reflect realistic settings, training used 224×224 images, a batch size of 32, an initial learning rate of 0.001, and early stopping after 20 consecutive non-improving epochs. Validation used a 20 per cent holdout. A second component simulates a gradient leakage attack on MNIST with a small CNN: the attacker jointly optimises a reconstructed image and label via a gradient-matching objective, using an L-BFGS-style optimiser. The attack assumes interception of a single gradient and evaluates reconstruction quality via SSIM and related metrics. The study also considers how robust the reconstruction is to the presence of DP in the victim model.
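For concreteness, the sketch below shows the general shape of such a gradient-matching attack in PyTorch, in the spirit of Deep Leakage from Gradients; the victim network, initialisation and iteration count are assumptions, not the paper's exact configuration.

import torch
import torch.nn.functional as F

# Victim: a small CNN on MNIST-sized inputs (an illustrative stand-in).
victim = torch.nn.Sequential(
    torch.nn.Conv2d(1, 8, 5), torch.nn.ReLU(),
    torch.nn.Flatten(), torch.nn.Linear(8 * 24 * 24, 10),
)

# One private example whose gradient the attacker is assumed to intercept.
x_true = torch.rand(1, 1, 28, 28)
y_true = torch.tensor([3])
true_grads = torch.autograd.grad(
    F.cross_entropy(victim(x_true), y_true), victim.parameters()
)

# The attacker jointly optimises a dummy image and soft label so that the
# gradient they induce matches the intercepted one.
x_dummy = torch.rand(1, 1, 28, 28, requires_grad=True)
y_dummy = torch.randn(1, 10, requires_grad=True)
opt = torch.optim.LBFGS([x_dummy, y_dummy], lr=0.1)

def closure():
    opt.zero_grad()
    pred = victim(x_dummy)
    loss = -(y_dummy.softmax(dim=-1) * F.log_softmax(pred, dim=-1)).sum()
    dummy_grads = torch.autograd.grad(loss, victim.parameters(), create_graph=True)
    # Gradient-matching objective: squared distance to the intercepted gradient.
    match = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    match.backward()
    return match

for _ in range(50):
    opt.step(closure)
# Without DP, x_dummy typically converges towards x_true; with DP-SGD clipping
# and noise applied to the shared gradient, the matching target no longer reveals it.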

Key Findings

  • DP-SGD substantially reduces gradient leakage risk, with reconstruction attempts yielding noisy outputs and very low structural similarity to the original images, indicating effective protection against GLAs at the cost of some loss in model utility.
  • PDP-SGD preserves classification performance more than DP-SGD in many cases, but it fails as a practical defence against reconstruction attacks, with reconstructed images from PDP-SGD closely resembling those from unprotected training.
  • In classification tasks, DP-SGD generally degrades performance compared with non-private training, and the degradation grows with stricter privacy budgets, though some simple architectures show anomalous results where DP-SGD matches or slightly exceeds baseline metrics. PDP-SGD often improves metrics for the simple CNN, and can slightly improve performance for some pretrained models, but its benefits diminish for highly capable architectures.
  • For the GLA simulations, DP-SGD stops meaningful reconstructions, effectively preventing data recovery from gradients, while PDP-SGD does not provide a practical barrier, with reconstructions similar to those from no privacy.
  • Attacks exhibit instability and convergence challenges, particularly for larger or multi-channel images, underscoring that the effectiveness of gradient leakage methods can depend on task complexity and model architecture.
  • Overall the work highlights a gap between theoretical privacy guarantees and practical resilience to attack vectors in distributed learning, underscoring the need for stronger or additional protections such as secure aggregation alongside empirical evaluation of privacy mechanisms.

Limitations

The experiments are conducted on a limited set of models and datasets, with MNIST used for the reconstruction attacks and a binary subset of Food-101 for classification. Results may not generalise to larger, multi-channel images or more complex tasks. The gradient leakage simulations focus on a single gradient rather than full federated aggregation, which may differ in practice. The effectiveness of PDP-SGD as a privacy mechanism appears context-dependent and requires further investigation to understand its vulnerabilities and boundaries in real-world deployments.

Why It Matters

The findings demonstrate that private data can be reconstructed from shared model updates in federated learning, exposing a tangible vulnerability in distributed AI systems. They show that DP-SGD can meaningfully reduce this leakage but at the cost of model utility, while PDP-SGD may preserve accuracy yet fail to protect against reconstruction attacks. This underlines the importance of empirically validating privacy mechanisms beyond theoretical guarantees and supports the push for stronger practical protections, such as secure aggregation, alongside careful privacy budgeting. The work carries societal implications for privacy, surveillance risk, and data protection in sensitive domains where distributed AI is deployed.

