While the progress and power of deep neural networks (DNNs) have accelerated the development of applications such as facial and object recognition, DNNs are known to be vulnerable to a variety of attack strategies. Among the most cunning are backdoor attacks, which corrupt a training dataset to cause DNNs to produce consistent, repeated misclassifications on inputs marked with a specific “trigger” pattern.
The danger posed by backdoor attacks has raised concerns in both academia and industry, even though most existing backdoor attack methods are either visible to human inspection or fragile to preprocessing defence procedures.
In a new paper, a research team from the University of Science and Technology of China, Microsoft Cloud AI, City University of Hong Kong and Wormpex AI Research ramps up the power of backdoor attacks, introducing “Poison Ink,” a robust and invisible method that is resistant to many state-of-the-art defence techniques.
The team summarizes their contributions as:
- We are the first to propose utilizing image structures as a carrier for trigger patterns, and show they have natural advantages over existing trigger pattern designs.
- We design a new backdoor attack framework, Poison Ink, which uses colourized image structures as the trigger pattern and hides the trigger pattern in an invisible way using a deep injection network.
- Extensive experiments demonstrate the stealthiness and robustness of Poison Ink, which is generally applicable to different datasets and network structures.
- Poison Ink works well in different attacking scenarios and has strong resistance to many SOTA defence techniques.
The team’s goals were to enable Poison Ink to maintain model performance on clean data, produce imperceptibly poisoned images that evade human inspection at the inference stage, and maintain high attack effectiveness even if the poisoned images are preprocessed via data transformations.
The Poison Ink pipeline comprises a trigger image generation process, backdoor model training, and backdoor model attacking.
The team generates their trigger patterns by embedding poisoned information into edge structures, then hides the resulting trigger pattern in the cover image with a deep invisible injection strategy. This novel trigger image generation approach has several advantages over existing attack strategies: 1) Edge structures are easily captured by the shallow layers of DNNs and do not undermine performance on the original task; 2) Edge structures keep their semantic meaning and physical existence under data transformations; 3) Edge structures are an inherent high-frequency component of images, so the attack can be effectively rendered invisible.
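The edge-based trigger idea above can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual method: it uses a naive gradient-magnitude edge detector in place of whatever extractor the authors use, a hypothetical colour code for the target label, and a simple additive blend where the paper instead trains a deep injection network.

```python
import numpy as np

def extract_edges(gray, thresh=0.2):
    """Crude gradient-magnitude edge map (stand-in for a proper edge extractor)."""
    gx = np.abs(np.diff(gray, axis=1, prepend=gray[:, :1]))
    gy = np.abs(np.diff(gray, axis=0, prepend=gray[:1, :]))
    return ((gx + gy) > thresh).astype(np.float32)

def make_trigger(cover_rgb, target_color=(1.0, 0.0, 0.0)):
    """Colourize the cover image's own edge structure with a colour that
    encodes the attacker's information (target_color is a made-up encoding)."""
    gray = cover_rgb.mean(axis=2)
    edges = extract_edges(gray)
    return edges[..., None] * np.asarray(target_color, dtype=np.float32)

def embed(cover_rgb, trigger, strength=0.03):
    """Naive low-amplitude additive embedding; Poison Ink instead learns the
    embedding with a deep injection network."""
    return np.clip(cover_rgb + strength * trigger, 0.0, 1.0)
```

Because the trigger lives only on edges, which are already high-frequency content, a small `strength` keeps the perturbation visually negligible while remaining tied to image structure rather than a fixed patch.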
Having established this edge structure-based trigger pattern, the team then designed a deep injection strategy to hide the trigger pattern in the cover image. The training process uses a deep injection network, an auxiliary guidance extractor network that helps the injection network learn, and an interference layer that enables the injection network to more robustly embed the trigger patterns.
After the backdoor model training, the deep injection network hides the trigger pattern into clean cover images, thus generating the poisoned images.
To demonstrate the invisibility and robustness of the proposed Poison Ink, the team conducted backdoor attacks on three types of classification tasks: general image recognition (CIFAR10 and ImageNet), traffic sign recognition (GTSRB), and face recognition (VGGFACE). They used a clean data accuracy (CDA) metric to evaluate the influence of their backdoor attacks on the original task, and attack success rate (ASR) to evaluate the overall effectiveness of the backdoor attacks.
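The two metrics are straightforward to state precisely. As a small sketch (the function names are ours, not the paper's): CDA is plain accuracy on clean inputs, and ASR is the fraction of poisoned inputs that the backdoored model classifies as the attacker's chosen target label.

```python
import numpy as np

def clean_data_accuracy(preds, labels):
    """CDA: accuracy of the backdoored model on clean inputs; a good backdoor
    leaves this close to the clean model's accuracy."""
    return float(np.mean(np.asarray(preds) == np.asarray(labels)))

def attack_success_rate(preds_on_poisoned, target_label):
    """ASR: fraction of poisoned inputs classified as the attacker's target."""
    return float(np.mean(np.asarray(preds_on_poisoned) == target_label))
```

An effective attack drives ASR toward 100 percent while leaving CDA essentially unchanged, which is why both numbers are reported together.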
The team also conducted human inspection testing, in which 30 people were asked to differentiate between image pairs (one clean, one poisoned) produced by different attack methods. While the poisoned images generated by most other methods were easily identified, the fooling rate of the proposed approach was close to 50 percent, i.e. no better than random guessing.
The empirical results demonstrate that Poison Ink outperforms existing attack methods in stealthiness, robustness, generality and flexibility, and is resistant to many state-of-the-art defence techniques. The study can serve as a wake-up call for machine learning researchers with regard to the need to develop new and more effective defence strategies against increasingly sophisticated backdoor attacks.
The paper Poison Ink: Robust and Invisible Backdoor Attack is on arXiv.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang