A fundamental concept in Chinese philosophy and culture is Yin and Yang — the belief that harmony is achieved when opposites coexist and share elements of each other. This can be interpreted to suggest that purpose and goodness can be found even in things like floodwaters, mosquitoes, and — in the world of artificial intelligence — adversarial examples.
Adversarial examples are perturbations added to an image that are invisible to the human eye but can trick a computer vision system into misclassifying objects — potentially causing, for example, an autonomous vehicle to drive through a stop sign. Adversarial examples are a bane to the researchers who build the neural networks that deliver much of today’s advanced AI.
Now, a team from Google and Johns Hopkins University says it has found a silver lining to adversarial examples. Rather than attempting to defend convolutional networks against them, the researchers introduce a novel enhanced adversarial training scheme, AdvProp, which treats adversarial examples as additional training examples to improve the accuracy of image classification models.
The researchers realized the positive potential of adversarial examples when they trained a medium-scale model, EfficientNet-B3, and a large-scale model, EfficientNet-B7, on the ImageNet dataset. They found that although training exclusively on adversarial examples degraded performance, training on both adversarial examples and clean images could improve network performance on clean images — provided the adversarial examples were harnessed properly. This prompted the team to wonder whether it might be possible to extract valuable features from adversarial examples and use these to boost model performance.
Paper coauthor Quoc Le is the highly respected AI researcher behind Google AutoML. In a tweet, he described AdvProp as a “weird trick” that enables adversarial examples to reduce model overfitting: “(The) key idea is to use two BatchNorms, one for normal examples and another one for adversarial examples.”
There have been previous approaches to improving vision models by using adversarial examples during training. These jointly trained models with clean images and adversarial examples without distinction. In the new study, researchers introduced a separate batch norm for the adversarial examples to reflect different underlying distributions. The researchers say their method handles the issue of distribution mismatch “via explicitly decoupling batch statistics on normalization layers, thus enabling a better absorption from both adversarial and clean features.”
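The decoupling idea can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the paper's code) of a batch-norm layer that keeps separate running statistics for clean and adversarial mini-batches while sharing the learnable scale and shift parameters; the class name, the `source` flag, and the simplified 2-D input shape are all assumptions made for clarity:

```python
import numpy as np

class DualBatchNorm:
    """Illustrative sketch of the AdvProp idea: normalize clean and
    adversarial mini-batches with separate statistics (two "BatchNorms"),
    while the learnable affine parameters gamma/beta remain shared."""

    def __init__(self, num_features, momentum=0.9, eps=1e-5):
        self.gamma = np.ones(num_features)   # shared learnable scale
        self.beta = np.zeros(num_features)   # shared learnable shift
        self.momentum = momentum
        self.eps = eps
        # one pair of running statistics per input distribution
        self.stats = {
            "clean": {"mean": np.zeros(num_features), "var": np.ones(num_features)},
            "adv":   {"mean": np.zeros(num_features), "var": np.ones(num_features)},
        }

    def __call__(self, x, source="clean"):
        # normalize with this mini-batch's own statistics (training mode)
        batch_mean, batch_var = x.mean(axis=0), x.var(axis=0)
        # update only the running statistics of the chosen branch
        s = self.stats[source]
        s["mean"] = self.momentum * s["mean"] + (1 - self.momentum) * batch_mean
        s["var"] = self.momentum * s["var"] + (1 - self.momentum) * batch_var
        x_hat = (x - batch_mean) / np.sqrt(batch_var + self.eps)
        return self.gamma * x_hat + self.beta

bn = DualBatchNorm(num_features=4)
clean_batch = np.random.randn(32, 4)            # stand-in for clean images
adv_batch = np.random.randn(32, 4) * 3.0 + 5.0  # stand-in for a shifted adversarial distribution
out_clean = bn(clean_batch, source="clean")
out_adv = bn(adv_batch, source="adv")
# each branch has now accumulated statistics matching its own distribution
print(bn.stats["clean"]["mean"].round(2), bn.stats["adv"]["mean"].round(2))
```

Routing each mini-batch to its own statistics branch is what resolves the distribution mismatch the paper describes: neither distribution contaminates the other's normalization statistics, yet both contribute gradients to the same shared weights.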
In experiments, AdvProp substantially outperformed the vanilla training baseline on all networks tested, as well as on all of the challenging distorted-image datasets. AdvProp boosted the larger EfficientNet-B8 network to a state-of-the-art 85.5 percent top-1 accuracy on ImageNet without any extra data.
The researchers say AdvProp succeeds by enabling models to learn much richer internal representations, providing them with global shape information for better classification while also increasing robustness. They also note that accuracy improvements are more significant when the image model is larger.
Le says the key innovation of using two BatchNorms overcomes “the mysterious drop in accuracy” that plagued previous attempts to use adversarial examples as data augmentation. He sees possible future applications for AdvProp beyond image recognition, in domains such as language and structured data.
The paper Adversarial Examples Improve Image Recognition is on arXiv.
Journalist: Fangyu Cai | Editor: Michael Sarazen