Unmet Need
Machine learning systems are increasingly used across industries, from banking and healthcare to government and retail. These systems train models on large amounts of data so that they can automatically make predictions from past trends, for example to detect fraud, diagnose cancer, or flag ransomware attacks. Because of this widespread adoption, trained models are often shared publicly, typically through interfaces that let anyone submit queries and receive predictions without access to the model parameters. However, recent literature shows that such models are vulnerable to membership inference, i.e., determining whether a given data instance was part of the model's training dataset. This is a significant privacy concern because some models are trained on sensitive data, such as patient health records.

To address this potential privacy breach, membership inference attack models are designed to test how robust a machine learning model truly is, and defense mechanisms are developed to harden models against these attacks. However, the most effective state-of-the-art attacks rely on shadow models, which closely resemble the target model and are used to generate training data for the attack. This assumption is unrealistic in practice, because most publicly available machine learning models provide only black-box, query-level access. State-of-the-art defenses, meanwhile, compromise model accuracy and, because they rely on perturbing the training process or the output vectors, are ineffective against adversaries with white-box access to the model parameters. Thus, there is a need for a robust, effective method of modeling membership inference attacks and of defending against such attacks by any type of adversary.
Technology Overview
Johns Hopkins researchers invented a novel method of modeling membership inference (MI) attacks and a robust defense approach that preserves the target model's accuracy. The attack model, BlindMI, probes the target model to extract membership information using a novel approach called differential comparison. In the accompanying publication, BlindMI was shown to outperform state-of-the-art attacks in terms of F1 score, and to achieve reasonable F1 scores even against models protected by defenses such as adversarial regularization, MemGuard, and differential privacy. The novel defense method is based on applying a secret perturbation to each input data instance. Adversaries, even those with white-box access to the target model's parameters, cannot reproduce its predictions without knowledge of this secret perturbation, preventing successful MI attacks. It was demonstrated to be the only defense that reduces MI attack accuracy while preserving the test accuracy of the target model.
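At a high level, differential comparison contrasts the suspect samples against a set of samples assumed to be non-members: a suspect sample is moved from one set to the other, and the resulting change in the distance between the two sets' output distributions indicates its likely membership. The sketch below is a minimal illustration of that idea in Python, assuming only black-box access to the target model's output probability vectors; the function names, the use of maximum mean discrepancy (MMD) as the set distance, and the kernel bandwidth are illustrative choices rather than the researchers' exact implementation.

```python
import numpy as np

def mmd(X, Y, sigma=1.0):
    """Squared maximum mean discrepancy between two sample sets (Gaussian kernel)."""
    def gram(A, B):
        d2 = (np.sum(A**2, axis=1)[:, None]
              + np.sum(B**2, axis=1)[None, :]
              - 2.0 * A @ B.T)
        return np.exp(-d2 / (2.0 * sigma**2))
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()

def differential_comparison(suspect_probs, nonmember_probs):
    """Label each suspect sample as member (True) or non-member (False).

    suspect_probs   : (n, c) array of the target model's softmax outputs for the
                      samples whose membership is being inferred.
    nonmember_probs : (m, c) softmax outputs for samples assumed to be non-members
                      (e.g., obtained by querying the model on transformed inputs).
    """
    labels = []
    base = mmd(suspect_probs, nonmember_probs)
    for i in range(len(suspect_probs)):
        # Move suspect i from the suspect set into the non-member set.
        rest = np.delete(suspect_probs, i, axis=0)
        augmented = np.vstack([nonmember_probs, suspect_probs[i:i + 1]])
        # Moving a true member into the non-member set pulls the two sets closer,
        # so a drop in the set distance flags the sample as a member.
        labels.append(mmd(rest, augmented) < base)
    return np.array(labels)
```

The published attack additionally refines the two sets iteratively; the single pass above is only meant to convey the core comparison.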
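The summary above does not specify the defense's perturbation scheme, so the following is only a rough sketch of the general idea under stated assumptions: the defender derives a fixed perturbation from a secret seed and applies it to every input before the model sees it, so that an adversary who obtains the model parameters but not the secret cannot reproduce the deployed model's input-output behavior. The class name, the additive form of the perturbation, and the epsilon magnitude are hypothetical.

```python
import numpy as np

class SecretlyPerturbedModel:
    """Illustrative wrapper only: serves predictions through a secret, fixed
    input perturbation. The researchers' actual perturbation scheme is not
    specified in this summary."""

    def __init__(self, model, input_shape, secret_seed, epsilon=0.05):
        self.model = model            # any object exposing .predict(batch)
        self.epsilon = epsilon
        rng = np.random.default_rng(secret_seed)   # the seed is the defender's secret
        # One fixed perturbation pattern, reused for every query.
        self.secret_pattern = rng.uniform(-1.0, 1.0, size=input_shape)

    def predict(self, batch):
        # Apply the secret perturbation before the model ever sees the input.
        # Without the seed, an adversary holding the raw parameters cannot
        # reproduce this input-output mapping.
        return self.model.predict(batch + self.epsilon * self.secret_pattern)
```

For the accuracy-preservation claim to hold, the model would presumably be trained or evaluated with the same perturbation applied; that step is outside this sketch.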
Contact Ivy Rivlin, Technology Manager, JHU Applied Physics Laboratory Technology Transfer Office: Ivy.Rivlin@jhuapl.edu
Stage of Development
Working Prototype.
Patent
N/A
Publication
Hui et al., "Practical Blind Membership Inference Attack via Differential Comparisons," Network and Distributed System Security (NDSS) Symposium, 2021.