Federated Learning with Differential Privacy: Balancing Model Performance and Data Protection in Distributed AI Systems
DOI:
https://doi.org/10.65021/mwsj.v1.i1.1
Keywords:
Federated Learning, Differential Privacy, Privacy-Preserving, Machine Learning, Distributed Systems, Data Protection
Abstract
As machine learning systems become increasingly prevalent in privacy-sensitive domains, the need for training high-performance models while preserving individual privacy has become paramount. This paper presents a comprehensive analysis of federated learning combined with differential privacy mechanisms, addressing the fundamental tension between model utility and privacy protection. We propose an adaptive noise calibration framework that dynamically adjusts privacy parameters based on model convergence patterns and client heterogeneity. Through extensive experiments on benchmark datasets, we demonstrate that our approach achieves superior privacy-utility trade-offs compared to existing methods, maintaining competitive model accuracy while providing strong theoretical privacy guarantees. Our results show that careful calibration of differential privacy parameters can reduce the performance degradation typically associated with privacy-preserving federated learning from 15-20% to 5-8% across various machine learning tasks.
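As a rough illustration of the ingredients named in the abstract, the sketch below shows one differentially private aggregation round: each client update is clipped to a fixed L2 norm, the clipped updates are averaged, and Gaussian noise is added. It is a minimal sketch under stated assumptions, not the authors' framework. The noise scale uses the classical Gaussian-mechanism bound, which holds for 0 < epsilon < 1, and the number of clients is treated as fixed and public. All function names are hypothetical, and `adapt_round_epsilon` in particular is an invented placeholder for "adjusting privacy parameters based on convergence"; the abstract does not specify the actual rule.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale an update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def gaussian_sigma(epsilon, delta, sensitivity):
    """Classical Gaussian-mechanism noise scale (valid for 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def private_average(updates, clip_norm, sigma, rng):
    """Average clipped client updates and add Gaussian noise per coordinate.

    sigma is the noise standard deviation for the *sum* of clipped updates,
    so the average receives noise with standard deviation sigma / n.
    """
    clipped = [clip_update(u, clip_norm) for u in updates]
    n = len(clipped)
    dim = len(clipped[0])
    return [
        sum(u[i] for u in clipped) / n + rng.gauss(0.0, sigma / n)
        for i in range(dim)
    ]


def adapt_round_epsilon(epsilon, prev_loss, loss, growth=1.2, eps_max=0.9):
    """Hypothetical rule (an assumption, not from the paper): spend more
    per-round budget, and hence add less noise, once the loss stops improving."""
    if prev_loss - loss < 1e-3:
        return min(epsilon * growth, eps_max)
    return epsilon


if __name__ == "__main__":
    rng = random.Random(0)
    client_updates = [[0.5, -1.2], [2.0, 0.3], [-0.4, 0.9]]
    clip_norm = 1.0
    epsilon = adapt_round_epsilon(0.5, prev_loss=0.80, loss=0.7995)
    sigma = gaussian_sigma(epsilon, delta=1e-5, sensitivity=clip_norm)
    print(private_average(client_updates, clip_norm, sigma, rng))
```

Clipping bounds how much any single client can move the sum, which is what lets the noise scale be tied to `clip_norm`. A complete system would also track the cumulative privacy loss across rounds with a composition accountant, which this sketch omits.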
License
Copyright (c) 2025 Milky Way Scientific Journal

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Authors retain copyright of their work and grant the Milky Way Scientific Journal (MWSJ) the right of first publication. All published articles are licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly cited.