Was George Orwell right? Is Big Brother watching us? Many are undoubtedly alarmed by the ever-increasing level of computer-driven surveillance, particularly involving facial recognition technologies. In the past few months, San Francisco and Oakland in California and Somerville, Massachusetts have all banned police use of facial recognition tech. Meanwhile, in Europe, the General Data Protection Regulation (GDPR) introduces restrictive rules on privacy preservation in data processing.
A team of researchers from the Norwegian University of Science and Technology recently proposed a new architecture that automatically anonymizes faces in images while leaving the original data distribution intact. They introduce the method in the paper DeepPrivacy: A Generative Adversarial Network for Face Anonymization.
Since its introduction in May 2018, the GDPR has profoundly affected the processing of personal data across Europe. The regulation requires consent from individuals for any use of their personal data. If, however, the data cannot be identified as belonging to a particular individual, companies can use it without consent. Face anonymization that preserves the existing data distribution is therefore a potential win-win for both companies and individuals.
The core challenge of face anonymization is building a robust model that removes all privacy-sensitive information while also generating a new, realistic face to preserve the visual integrity of the data. Individuals in pictures differ widely in pose, background, and other appearance features, so the solution must handle all of this conditional information accordingly.
The DeepPrivacy model is a conditional generative adversarial network (GAN). The researchers designed a generator that never observes the original faces, yet can remove all privacy-sensitive information and generate realistic anonymized faces. They simplified the model so it requires only two annotations of the face: a bounding box and a sparse pose estimate. The former locates the privacy-sensitive area for further processing, and the latter uses keypoints for the ears, eyes, nose, and shoulders.
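To illustrate how such conditioning can work, the face region can be blanked out before the image ever reaches the generator, with the sparse pose encoded as an extra channel. This is a minimal numpy sketch of the idea; the function name and the one-hot keypoint encoding are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def prepare_generator_input(image, bbox, keypoints):
    """Build a conditional input for an anonymizing generator.

    The privacy-sensitive region (bbox) is zeroed out so the generator
    never observes the original face; the sparse pose keypoints (ears,
    eyes, nose, shoulders) are encoded as an extra channel.
    Illustrative sketch only, not DeepPrivacy's exact code.
    """
    h, w, _ = image.shape
    x0, y0, x1, y1 = bbox
    masked = image.copy()
    masked[y0:y1, x0:x1, :] = 0  # remove all face information

    pose = np.zeros((h, w, 1), dtype=image.dtype)
    for kx, ky in keypoints:
        if 0 <= ky < h and 0 <= kx < w:
            pose[ky, kx, 0] = 255  # mark each keypoint location

    # Generator sees the masked image plus the pose map: H x W x 4
    return np.concatenate([masked, pose], axis=-1)
```

In this sketch the network can only infer the missing face from its surroundings and the pose map, which is what guarantees the output contains no information about the original identity.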
Another critical component in the study is the dataset. Researchers built a new dataset of 1.47 million human faces, Flickr Diverse Faces (FDF). Each face is annotated with a bounding box and keypoints. Researchers say FDF “covers a considerably large diversity of facial poses, partial occlusions, complex backgrounds, and different persons” compared to previous facial datasets.
The secret sauce of the DeepPrivacy model is a progressive growing training technique applied to both the generator and the discriminator: the resolution doubles each time the network expands, from a starting resolution of 8×8 up to 128×128, with the pose information growing finer at each step.
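The resolution schedule this implies can be written in a few lines. A small sketch of the doubling stages (illustrative of the training technique, not DeepPrivacy's training code):

```python
def progressive_resolutions(start=8, final=128):
    """Resolution schedule for progressive GAN training: the networks
    double their operating resolution at each growth stage.
    """
    res = start
    schedule = [res]
    while res < final:
        res *= 2
        schedule.append(res)
    return schedule

# From an 8x8 start, the stages are 8, 16, 32, 64, and 128 pixels.
```

Training at low resolution first stabilizes the GAN, and each new stage refines detail that the previous one could not represent.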
Researchers conducted extensive qualitative and quantitative experiments to assess the model’s ability to retain the original data distribution. They first anonymized the WIDER-Face dataset, then ran face detection on the anonymized images and measured Average Precision (AP). The model retained 99.3 percent of the original AP. Previous anonymization techniques achieved 96.7 percent (8×8 pixelation), 90.5 percent (heavy blur), and 41.4 percent (black-out).
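The baselines the model is compared against are simple image filters. A minimal numpy sketch of two of them, pixelation and black-out, assuming the face crop's height and width are divisible by the block size:

```python
import numpy as np

def pixelate(face, block=8):
    """Pixelation baseline: replace each block x block tile of the face
    with its mean color (an 8x8 mosaic when block=8)."""
    h, w, c = face.shape
    small = face.reshape(h // block, block, w // block, block, c).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, block, axis=0), block, axis=1)

def black_out(face):
    """Black-out baseline: erase the face region entirely."""
    return np.zeros_like(face)
```

Such filters destroy image statistics along with identity, which is why a detector loses far more AP on them than on faces regenerated by a GAN.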
Although DeepPrivacy can generate distorted faces when, for example, subjects are in irregular poses or the picture has a complex background, the method has proven capabilities and high potential for securing privacy in visual data.
It’s also possible that de-identifying faces could help in downstream applications such as pedestrian detection, where public safety might be improved around roadways without identifying involved individuals from facial details.
The paper DeepPrivacy: A Generative Adversarial Network for Face Anonymization is on arXiv, and the project source code is on GitHub.
Journalist: Fangyu Cai | Editor: Michael Sarazen