When machine learning and computer vision models are trained to solve image classification or segmentation tasks, they can sometimes reproduce biases buried in the training dataset. By the same token, detecting such biases can signal that a particular dataset was used to train a model. Inspired by this dataset traceability phenomenon, researchers from Facebook AI Research and the French National Institute for Research in Digital Science and Technology (INRIA) have proposed a “radioactive data” technique for subtly marking images in a dataset so that researchers can later determine whether those images were used to train a particular model.
Unlike data-poisoning or backdoor methods, where manipulated samples are added to a training set to degrade model performance, Facebook AI’s approach uses “clean-label” perturbations that leave labels intact, and its detection test comes with a statistical guarantee in the form of a p-value.
The radioactive data-marking method comprises a marking stage, a training stage, and a detection stage. In the marking stage, “radioactive” marks or “data isotopes” are added to vanilla training images without changing their labels. In the training stage, vanilla and/or marked images are used to train multi-class classifiers with regular learning algorithms. Finally, in the detection stage, researchers examine the trained models to determine whether the marked data was used.
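The three stages can be illustrated with a minimal numpy sketch. This is a toy simplification, not the paper’s actual pipeline: the real method perturbs images so their CNN features align with per-class carrier directions, whereas here we work directly in a hypothetical feature space with a single carrier and a crude linear fit.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512  # hypothetical feature dimension

# Marking stage: pick a random unit "carrier" direction u and nudge the
# features of the images to be marked toward it; labels are untouched.
u = rng.normal(size=d)
u /= np.linalg.norm(u)

def mark(features, strength=2.0):
    # Clean-label mark: only the inputs move, never the labels.
    return features + strength * u

# Toy training stage: labels depend on an unrelated feature; class-1
# samples are marked. A linear fit then picks up a component along u.
X = rng.normal(size=(1000, d))
y = (X[:, 0] > 0).astype(float)
X_marked = np.where(y[:, None] == 1.0, mark(X), X)
w_marked, *_ = np.linalg.lstsq(X_marked, y, rcond=None)
w_vanilla, *_ = np.linalg.lstsq(X, y, rcond=None)

# Detection stage: cosine similarity between the learned weights and u.
def cosine(w):
    return float(w @ u / np.linalg.norm(w))

print(cosine(w_marked))   # clearly positive: the mark left a trace
print(cosine(w_vanilla))  # near zero: no alignment with u
```

The key property the sketch reproduces is that a model trained on marked data carries a measurable alignment with the carrier, while a model trained on vanilla data does not.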
The term “radioactive data” was inspired by the use of radioactive markers in medical imaging, which allow doctors, for example, to more clearly view and interpret scans. The researchers say the markers remain present throughout the learning process and are detectable with high confidence in a trained neural network, while having no negative effect on the model’s classification accuracy. The method also provides a confidence estimate in the form of a p-value.
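The p-value has a natural interpretation: under the null hypothesis that the marked data was not used, the carrier is just a random direction, so its cosine similarity with the classifier weights should be no larger than that of a random unit vector. A small numpy-only sketch (the dimension and thresholds below are illustrative assumptions, and the paper derives this distribution in closed form rather than by simulation) estimates that probability by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512          # hypothetical feature dimension
n = 200_000      # Monte Carlo samples

# cos(theta) for a random unit vector in R^d has the same distribution
# as z / sqrt(z^2 + chi2_{d-1}) with z standard normal.
z = rng.normal(size=n)
t = z / np.sqrt(z**2 + rng.chisquare(d - 1, size=n))

def p_value(c):
    """Chance a random direction aligns with the weights at least this well."""
    return float(np.mean(np.abs(t) >= abs(c)))

print(p_value(0.01))  # weak alignment: p near 1, no evidence of marking
print(p_value(0.30))  # strong alignment for d=512: p ~ 0, mark detected
```

In high dimension, random directions are nearly orthogonal, which is why even a modest cosine similarity yields a very small p-value.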
Experiment results suggest the radioactive data method is effective on large-scale computer vision tasks such as ImageNet classification with modern architectures (ResNet-18 and ResNet-50), even when only a very small fraction (1 percent) of the training data is radioactive. The proposed method is robust to data augmentation and the stochasticity of deep network optimization, and delivers a much higher signal-to-noise ratio than data-poisoning or backdoor methods.
The researchers believe the radioactive method is appropriate for real use cases, while acknowledging its limitations in adversarial scenarios, as the method assumes the party training the model makes no attempt to distinguish radioactive data from vanilla data.
The proposal is thus restricted to a proof of concept at this time. For future research, they hope to address a more challenging scenario under Kerckhoffs’s principle — namely, that “a cryptographic system should be secure even if everything about the system, except the key, is public knowledge.”
The paper Radioactive Data: Tracing Through Training is on arXiv.
Author: Yuqing Li | Editor: Michael Sarazen