Abstract: Recently, generative adversarial networks (GANs) have achieved stunning realism, fooling even human observers. Indeed, the popular tongue-in-cheek website http://thispersondoesnotexist.com, taunts users with GAN generated images that seem too real to believe. On the other hand, GANs do leak information about their training data, as evidenced by membership attacks recently demonstrated in the literature. In this work, we challenge the assumption that GAN faces really are novel creations, by constructing a successful membership attack of a new kind. Unlike previous works, our attack can accurately discern samples sharing the same identity as training samples without being the same samples. We demonstrate the interest of our attack across several popular face datasets and GAN training procedures. Notably, we show that even in the presence of significant dataset diversity, an over represented person can pose a privacy concern.
Load up the website This Person Does Not Exist and it’ll show you a human face, near-perfect in its realism yet totally fake. Refresh and the neural network behind the site will generate another, and another, and another. The endless sequence of AI-crafted faces is produced by a generative adversarial network (GAN) — a type of AI that learns to produce realistic but fake examples of the data it is trained on. But such generated faces — which are starting to be used in CGI movies and ads — might not be as unique as they seem. In a paper titled This Person (Probably) Exists (PDF), researchers show that many faces produced by GANs bear a striking resemblance to actual people who appear in the training data. The fake faces can effectively unmask the real faces the GAN was trained on, making it possible to expose the identity of those individuals. The work is the latest in a string of studies that call into doubt the popular idea that neural networks are “black boxes” that reveal nothing about what goes on inside.
To expose the hidden training data, Ryan Webster and his colleagues at the University of Caen Normandy in France used a type of attack called a membership attack, which can be used to find out whether certain data was used to train a neural network model. These attacks typically take advantage of subtle differences between the way a model treats data it was trained on — and has thus seen thousands of times before — and unseen data. For example, a model might identify a previously unseen image accurately, but with slightly less confidence than one it was trained on. A second, attacking model can learn to spot such tells in the first model’s behavior and use them to predict when certain data, such as a photo, is in the training set or not.
Such attacks can lead to serious security leaks. For example, finding out that someone’s medical data was used to train a model associated with a disease might reveal that this person has that disease. Webster’s team extended this idea so that instead of identifying the exact photos used to train a GAN, they identified photos in the GAN’s training set that were not identical but appeared to portray the same individual — in other words, faces with the same identity. To do this, the researchers first generated faces with the GAN and then used a separate facial-recognition AI to detect whether the identity of these generated faces matched the identity of any of the faces seen in the training data. The results are striking. In many cases, the team found multiple photos of real people in the training data that appeared to match the fake faces generated by the GAN, revealing the identity of individuals the AI had been trained on.