Membership inference attacks on machine learning models: analysis and mitigation



University of New Brunswick


Given a machine learning model and a record, a membership inference attack determines whether the record was part of the model's training dataset. Such attacks put private datasets at risk when those datasets are used to train machine learning models that are then made publicly accessible. For example, knowing that a certain patient's record was used to train a model associated with a disease can reveal that the patient has this disease. To construct attack models, multiple shadow models are created that imitate the behavior of the target model, but whose training datasets are known, and with them the ground truth about membership in those datasets. Attack models are then trained on the labeled inputs and outputs of the shadow models. Further analysis of this attack is needed, along with robust mitigation techniques that do not degrade the target model's utility. In this thesis, we explored new combinations of parameters and settings that had not been studied in the literature, which provided useful insights into the behavior of the membership inference attack. We also proposed and evaluated different mitigation techniques against this type of attack, considering different training algorithms for the target model. Our experiments showed that the defense strategies mitigate the membership inference attack considerably while preserving the utility of the target model. Finally, we summarized the existing mitigation techniques and compared them with our results.
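
The shadow-model procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the setup used in the thesis: the synthetic data, the `MemorizingModel` class (a toy nearest-neighbor classifier standing in for an overfitted target), and the single-threshold attack rule are all assumptions chosen to keep the example self-contained. A full attack model would typically be a trained classifier over the shadow models' output vectors rather than one threshold.

```python
import math
import random


def sample_records(rng, n):
    """Draw n synthetic labeled 2-D records (hypothetical data)."""
    records = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label == 1 else -1.0
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        records.append((point, label))
    return records


class MemorizingModel:
    """Toy 1-nearest-neighbor classifier standing in for an overfitted model
    (an assumption for illustration, not a model from the thesis)."""

    def __init__(self, train):
        self.train = list(train)

    def predict_with_confidence(self, point):
        """Return (label, confidence); confidence decays with the distance
        to the nearest training point, so training members score 1.0."""
        nearest_point, label = min(self.train, key=lambda r: math.dist(point, r[0]))
        return label, math.exp(-math.dist(point, nearest_point))


def build_attack_dataset(rng, n_shadows, n_records):
    """Train shadow models on known data and label their outputs in/out."""
    attack_data = []
    for _ in range(n_shadows):
        members = sample_records(rng, n_records)
        non_members = sample_records(rng, n_records)
        shadow = MemorizingModel(members)
        for point, _ in members:
            attack_data.append((shadow.predict_with_confidence(point)[1], 1))
        for point, _ in non_members:
            attack_data.append((shadow.predict_with_confidence(point)[1], 0))
    return attack_data


def fit_threshold_attack(attack_data):
    """Pick the confidence threshold that best separates members from non-members."""
    def accuracy(threshold):
        hits = sum((conf >= threshold) == bool(is_member) for conf, is_member in attack_data)
        return hits / len(attack_data)
    return max(sorted({conf for conf, _ in attack_data}), key=accuracy)


def attack_accuracy(model, members, non_members, threshold):
    """Fraction of records whose membership the attack guesses correctly."""
    guesses = [(model.predict_with_confidence(p)[1] >= threshold, True) for p, _ in members]
    guesses += [(model.predict_with_confidence(p)[1] >= threshold, False) for p, _ in non_members]
    return sum(guess == truth for guess, truth in guesses) / len(guesses)


rng = random.Random(0)
threshold = fit_threshold_attack(build_attack_dataset(rng, n_shadows=5, n_records=50))

# In practice the attacker does not know the target's training set;
# it is kept here only to score the attack.
target_members = sample_records(rng, 50)
target_non_members = sample_records(rng, 50)
target = MemorizingModel(target_members)
print(attack_accuracy(target, target_members, target_non_members, threshold))
```

Because the toy model memorizes its training records, members and non-members are easy to tell apart from the confidence score alone. A model that generalizes better would narrow that gap, which is the effect a mitigation technique aims for.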