Machine learning and privacy protection: Attacks and defenses

Zari, Oualid
Thesis

The increasing adoption of machine learning (ML) algorithms in privacy-sensitive domains has revolutionized data analysis across numerous fields. These algorithms, including Deep Neural Networks, Principal Component Analysis (PCA), and Graph Neural Networks (GNNs), process vast amounts of data to extract valuable insights. The integration of these techniques into critical applications handling sensitive information, from healthcare records to social network data, has led to significant advances in data analysis and prediction capabilities. However, as these ML models become more prevalent, they inadvertently expose vulnerabilities that can compromise individual privacy through various attack vectors.

This thesis addresses the critical privacy challenges posed by modern ML systems, focusing particularly on membership inference attacks (MIA) and link inference attacks (LIA). We propose novel attack methodologies and corresponding defensive measures that enhance our understanding of privacy vulnerabilities while providing practical solutions for privacy preservation. We aim to design these solutions while maintaining a careful balance between privacy protection and model utility, specifically for real-world ML applications.

In our first contribution, we present a novel MIA against PCA, where an adversary can determine whether a specific data sample was used in computing the principal components. We demonstrate that this attack achieves high success rates, particularly when the number of samples used by PCA is relatively small. To counter this vulnerability, we investigate different approaches to implementing differential privacy mechanisms in PCA, analyzing their effectiveness in protecting against MIAs while preserving data utility. We provide comprehensive empirical evidence showing the trade-offs between privacy guarantees and model performance.
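To give a concrete but simplified picture of this setting (not the exact methodology of the thesis), a membership test against released principal components can be instantiated by thresholding the reconstruction error of a candidate sample, and a differentially private variant of PCA can be obtained by perturbing the empirical covariance with Gaussian noise before the eigendecomposition. The minimal sketch below assumes such a reconstruction-error score and an Analyze-Gauss-style perturbation; the function names, threshold, norm bound, and noise calibration are illustrative assumptions.

    # Illustrative sketch only: one common way to instantiate a membership test
    # against released principal components is to threshold the reconstruction
    # error of a candidate sample; the exact attack and defense in the thesis
    # may differ. All names, bounds, and parameters below are assumptions.
    import numpy as np

    def reconstruction_error(x, components, mean):
        """Squared distance between x and its reconstruction from the released
        principal components (components: array of shape (k, d), rows = PCs)."""
        centered = x - mean
        projected = centered @ components.T           # coordinates in the subspace
        reconstructed = projected @ components + mean
        return float(np.sum((x - reconstructed) ** 2))

    def membership_guess(x, components, mean, threshold):
        """Guess 'member' when x is unusually well explained by the released
        components, i.e. its reconstruction error falls below a threshold."""
        return reconstruction_error(x, components, mean) < threshold

    def dp_covariance(X, epsilon, delta, rng=None):
        """Gaussian-mechanism perturbation of the empirical covariance before
        the eigendecomposition (in the spirit of 'Analyze Gauss'); assumes each
        row of X has L2 norm at most 1 and treats the covariance's sensitivity
        as roughly 1/n; the exact calibration in the thesis may differ."""
        rng = rng or np.random.default_rng()
        n, d = X.shape
        cov = X.T @ X / n
        sigma = np.sqrt(2.0 * np.log(1.25 / delta)) / (n * epsilon)
        noise = rng.normal(0.0, sigma, size=(d, d))
        noise = (noise + noise.T) / 2.0               # keep the perturbation symmetric
        return cov + noise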

In our second contribution, we introduce a new link inference attack against GNNs, namely the Node Injection Link Stealing (NILS) attack, which demonstrates how an adversary can exploit the dynamic nature of GNNs by injecting malicious nodes to infer edge information. We evaluate this attack according to a new differential privacy notion dedicated to graph structures and further propose dedicated defense strategies.
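As a rough, hypothetical illustration of the general idea behind node-injection link stealing (the precise NILS procedure may differ), an adversary with black-box query access to a GNN in a dynamic setting could attach a crafted node to a target node v and test whether the posterior of another node u shifts noticeably; the graph and model APIs below are assumptions made for the sketch.

    # Hypothetical sketch of node-injection link stealing against a GNN with
    # black-box query access in an inductive/dynamic setting. The graph and
    # model APIs (predict, add_node), the injected features, and the decision
    # rule are assumptions for illustration; the NILS attack may differ.
    import numpy as np

    def infer_link(model, graph, u, v, injected_features, threshold):
        """Guess whether edge (u, v) exists by attaching a crafted node to v
        and measuring how much the posterior of u changes."""
        before = model.predict(graph, node=u)         # posterior before injection
        poisoned = graph.add_node(features=injected_features, neighbors=[v])
        after = model.predict(poisoned, node=u)       # posterior after injection
        # If u lies in the receptive field of v, the injected features reach u
        # through the edge (u, v) during message passing and perturb its output.
        return float(np.abs(after - before).sum()) > threshold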

Finally, our third contribution focuses on the distributed setting, namely Vertical Federated Graph Learning (VFGL), where we develop a gradient-based LIA that reveals how gradient information exchanged during federated training can leak details of the graph structure. As with NILS, this attack is evaluated across various datasets and model architectures, and dedicated defensive strategies based on differential privacy mechanisms are proposed.
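The following sketch illustrates, under stated assumptions, how per-node gradient vectors observed during vertical federated graph learning might be turned into edge guesses: message passing makes neighboring nodes' representations depend on each other, so their gradients tend to be correlated, and pairwise gradient similarity can therefore serve as an edge score. The data layout, the cosine-similarity score, and the threshold are illustrative choices, not the thesis's exact attack.

    # Hypothetical sketch: score candidate edges by the similarity of per-node
    # gradient vectors observed during vertical federated graph learning.
    # Neighboring nodes aggregate each other's representations, so their
    # gradients tend to be correlated; the layout, score, and threshold here
    # are illustrative assumptions, not the thesis's exact attack.
    import numpy as np

    def edge_scores(node_gradients):
        """Pairwise cosine similarity of per-node gradients; a high score for
        (i, j) is taken as evidence of an edge between nodes i and j."""
        G = np.asarray(node_gradients, dtype=float)   # shape: (num_nodes, dim)
        normalized = G / (np.linalg.norm(G, axis=1, keepdims=True) + 1e-12)
        return normalized @ normalized.T

    def infer_edges(node_gradients, threshold=0.9):
        """Return candidate (i, j) pairs whose score exceeds the threshold;
        each undirected edge appears twice, as (i, j) and (j, i)."""
        scores = edge_scores(node_gradients)
        np.fill_diagonal(scores, -np.inf)             # ignore self-pairs
        return np.argwhere(scores > threshold)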


Type:
Thesis
Date:
2025-01-14
Department:
Digital Security
Eurecom Ref:
8022
Copyright:
© EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in Thesis and is available at:

PERMALINK : https://www.eurecom.fr/publication/8022