Data plays a foundational role in the development of machine learning models, and data augmentation techniques are a powerful way to increase the amount and diversity of data, reduce overfitting, and improve model performance. A well-designed data augmentation strategy, however, typically requires time-consuming manual intervention informed by strong domain expertise and prior knowledge of the dataset of interest.
In the ICLR 2022 conference paper Deep AutoAugment, a research team from Michigan State University and Amazon Web Services proposes Deep AutoAugment (DeepAA), a fully automated data augmentation search method that eliminates the need for hand-crafted default transformations.
The team summarizes its main contributions as follows:
- We propose Deep AutoAugment (DeepAA), a fully automated data augmentation search method that finds a multi-layer data augmentation policy from scratch.
- We formulate such multi-layer data augmentation search as a regularized gradient matching problem. We show that maximizing cosine similarity along the direction of low variance is effective for data augmentation search when augmentation layers go deep.
- We address the issue of exponential growth of the dimensionality of the search space when more augmentation layers are added by incrementally adding augmentation layers based on the data distribution transformed by all the previous augmentation layers.
- Our experiment results show that, without using any default augmentations, DeepAA achieves stronger performance compared with prior works.
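The incremental search described in the third bullet can be illustrated with a toy sketch. This is not the paper's implementation: `search_one_layer`, `apply_layers`, and the numeric stand-in for data are hypothetical, and only show the key idea that each new layer is chosen based on data already transformed by the layers found so far, so the per-step search space stays flat rather than growing exponentially with depth.

```python
def apply_layers(policy, data):
    # Pass the data through every augmentation layer found so far.
    for op in policy:
        data = op(data)
    return data

def search_one_layer(transformed_data):
    # Toy "search" step: pick an op based on the current (already-transformed)
    # data. Purely illustrative; the real method optimizes a gradient-matching
    # objective over a distribution of transformations.
    if transformed_data < 3:
        return lambda x: x + 1
    return lambda x: x / 2

# Grow the policy one layer at a time, re-transforming the data each step.
policy = []
data = 0
for _ in range(4):
    data_t = apply_layers(policy, data)
    policy.append(search_one_layer(data_t))
```

Because each layer is searched against the output of all previous layers, adding a layer never multiplies the size of the search problem.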
Automating data augmentation policy design is considered a promising paradigm, and a seminal work in this area is AutoAugment (Cubuk et al., 2019), which applies a reinforcement learning framework to the task. While subsequent AutoAugment-based and other approaches have reduced compute burdens and boosted performance, today's state-of-the-art automated data augmentation methods still require strong domain knowledge injection by experts.
The proposed DeepAA is a multi-layer data augmentation search method designed to eliminate the need for hand-crafted default transformations. It fully automates the data augmentation process by searching a deep data augmentation policy on an extended set of transformations that contains the widely adopted search space and the default transformations.
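Structurally, such a "deep" policy is a stack of augmentation layers, each holding a learned probability distribution over candidate transformations. The following is a minimal Python sketch of that structure; the op functions are toy numeric stand-ins, not real image transformations (the actual search space contains standard ops such as rotate, flip, Cutout, and crop):

```python
import random

# Stand-ins for real image operations (names are illustrative only).
def rotate(x): return x + 1
def flip(x): return -x
def identity(x): return x

OPS = [rotate, flip, identity]

def sample_policy_layer(probs):
    """Sample one op from a layer's learned distribution over OPS."""
    return random.choices(OPS, weights=probs, k=1)[0]

def apply_policy(x, layer_probs):
    # A deep policy is a stack of layers; each layer samples and applies
    # one transformation in sequence.
    for probs in layer_probs:
        op = sample_policy_layer(probs)
        x = op(x)
    return x
```

With a one-hot distribution per layer the sampling is deterministic, e.g. `apply_policy(0, [[1, 0, 0], [1, 0, 0]])` applies `rotate` twice.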
The team formulates data augmentation policy search as a regularized gradient matching problem, making it possible to search for optimal data augmentation policies by maximizing the cosine similarity of the gradients between augmented data and original data with regularization. To solve the search space’s exponential growth of dimensionality problem when more augmentation layers are introduced, the researchers incrementally stack the augmentation layers based on the data distribution transformed by all the previous augmentation layers.
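The gradient-matching idea can be sketched as follows. This is a simplified numpy illustration that assumes gradients have already been flattened into vectors; the regularization toward low-variance directions used in the paper is omitted, and the toy gradient values are invented for the example.

```python
import numpy as np

def cosine_similarity(g1, g2):
    # Cosine similarity between two flattened gradient vectors.
    return float(np.dot(g1, g2) /
                 (np.linalg.norm(g1) * np.linalg.norm(g2) + 1e-12))

def score_augmentations(ref_grad, aug_grads):
    """Score each candidate augmentation by how well the gradient computed
    on its augmented batch aligns with the reference gradient from the
    original data; return the index of the best-aligned candidate."""
    scores = [cosine_similarity(ref_grad, g) for g in aug_grads]
    return int(np.argmax(scores)), scores

# Toy example: 3 candidate augmentations, gradients in a 4-dim space.
ref_grad = np.array([1.0, 0.0, 1.0, 0.0])
aug_grads = [
    np.array([0.9, 0.1, 1.1, 0.0]),    # well aligned
    np.array([-1.0, 0.0, -1.0, 0.0]),  # opposite direction
    np.array([0.0, 1.0, 0.0, 1.0]),    # orthogonal
]
best, scores = score_augmentations(ref_grad, aug_grads)
```

An augmentation whose gradient points the same way as the original data's gradient scores near 1, while harmful (opposing) augmentations score negative, which is what makes cosine similarity a usable search signal.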
In their evaluations, the researchers compared the proposed DeepAA to benchmark automated data augmentation search methods such as Faster AutoAugment, RandAugment, and Adversarial AutoAugment on the CIFAR-10, CIFAR-100, and ImageNet datasets.
The team reports that, without any default augmentations, DeepAA achieved the strongest performance among the baseline automatic augmentation search methods on CIFAR-10 and CIFAR-100 with Wide-ResNet-28-10, and on ImageNet with ResNet-50 and ResNet-200.
The team also identifies three interesting observations from their results:
- AA, FastAA and DADA assign high probability (summing to over 1.0 across sub-policies) to flip, Cutout and crop, as those transformations are hand-picked and applied by default. DeepAA finds a similar pattern, assigning high probability to flip, Cutout and crop.
- Unlike AA, which focuses mainly on colour transformations, DeepAA assigns high probability to both spatial and colour transformations.
- FastAA has evenly distributed magnitudes, while DADA has low magnitudes (a common issue in DARTS-like methods). Interestingly, DeepAA assigns high probability to stronger magnitudes.
Author: Hecate He | Editor: Michael Sarazen