Charles Darwin’s nineteenth-century theory of natural selection is a cornerstone of modern biology. Considering that today’s deep neural networks (DNNs) were originally inspired by the biological networks of neurons that constitute animal brains, is it possible that the paradigm of evolution could also be applied to DNN design and development? A new collaborative study from Google Research and DeepMind explores the possibilities.
In the paper Evolving Normalization-Activation Layers, the researchers unify DNN normalization layers and activation functions into a single computation graph. Unlike existing manual design patterns, the new structure was designed to evolve from low-level primitives rather than relying on well-defined building blocks. The team introduces a set of new normalization-activation layers, EvoNorms, that achieve improved accuracy on a variety of image classification models, including ResNets, MobileNets, and EfficientNets.
The researchers examined the generalization ability of EvoNorms and determined that the layers transfer well to Mask R-CNN for instance segmentation and BigGAN for image synthesis. The EvoNorm-B0 layer, for example, outperforms BN-ReLU across various models for image classification and instance segmentation, and shows promising results on GANs.
Normalization layers and activation functions are two important components that frequently appear together in deep convolutional networks, where they support stable optimization and improved generalization. The paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, which introduced batch normalization and dramatically accelerated deep neural network training, placed normalization immediately before the ReLU activation function. Deep learning researchers have since generally treated the two as separately designed components applied in sequence.
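To make the conventional sequential pattern concrete, here is a minimal sketch of batch normalization followed by ReLU for a single channel, written in plain Python. The function names and the default values of gamma, beta and eps are illustrative assumptions, not code from either paper.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel's activations across the batch, then scale and shift.

    gamma, beta and eps are illustrative defaults (assumptions).
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

def relu(values):
    """Zero out negative activations."""
    return [max(0.0, v) for v in values]

def bn_relu(values):
    """The sequential design: normalization first, activation second."""
    return relu(batch_norm(values))

activations = [1.0, 2.0, 3.0, 4.0]
print(bn_relu(activations))
```

The two steps are independent functions applied one after the other, which is the separation the new work sets out to remove.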
The Google and DeepMind researchers instead formulate the normalization and activation layers as a single building block. Each normalization-activation layer is represented as a tensor-to-tensor computation graph. To keep the search efficient in a large and sparse space of candidate layers, the researchers combined an evolutionary algorithm with a rejection mechanism.
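One way to picture a layer expressed as a computation graph over low-level primitives is the sketch below. The primitive set, the node encoding and the example graph are all hypothetical, chosen for illustration; the article does not list the primitives or graph format the paper actually uses, and real layers operate on multi-dimensional tensors rather than flat lists.

```python
import math

# Hypothetical primitive set (assumption): each primitive maps lists to a list.
PRIMITIVES = {
    "add": lambda a, b: [x + y for x, y in zip(a, b)],
    "mul": lambda a, b: [x * y for x, y in zip(a, b)],
    "max": lambda a, b: [max(x, y) for x, y in zip(a, b)],
    "neg": lambda a: [-x for x in a],
    "sigmoid": lambda a: [1.0 / (1.0 + math.exp(-x)) for x in a],
}

def evaluate(graph, x):
    """Evaluate a layer given as a list of (op, input_indices) nodes.

    Index 0 refers to the input; node k of the graph is stored at index k + 1.
    The value of the last node is the layer output.
    """
    values = [x]
    for op, inputs in graph:
        args = [values[i] for i in inputs]
        values.append(PRIMITIVES[op](*args))
    return values[-1]

# x * sigmoid(x), expressed with two primitive nodes.
swish_like = [("sigmoid", [0]), ("mul", [0, 1])]
print(evaluate(swish_like, [-1.0, 0.0, 2.0]))
```

Because a layer is just a list of nodes, a mutation can be as simple as swapping one node's primitive or rewiring one of its inputs, which is what makes this kind of representation convenient for evolutionary search.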
The evolutionary algorithm is a variant of tournament selection: at each step a random subset of the population (of layers) is drawn, and only the winner of that subset is allowed to produce mutated offspring. The rejection mechanism, meanwhile, enabled the researchers to discard undesired layers based on their quality and stability. For example, layers that achieved less than 20 percent validation accuracy in 100 training steps on any of the common architectures (ResNet-CIFAR, MobileNetV2-CIFAR, and EfficientNet-CIFAR) were rejected.
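The search loop described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: candidate "layers" are stand-in numbers, the function names, tournament size and step count are hypothetical, and removing the oldest member when a child is added is an assumed replacement policy that the article does not specify. Only the 20 percent cutoff and the "any architecture" rule come from the article.

```python
import random

REJECTION_THRESHOLD = 0.20  # validation accuracy cutoff reported in the article

def evolve(population, fitness, quick_accuracies, mutate,
           tournament_size=3, steps=200, seed=0):
    """Tournament selection with a rejection check on each mutated child."""
    rng = random.Random(seed)
    population = list(population)
    for _ in range(steps):
        # Draw a random subset; only its best member reproduces.
        contestants = rng.sample(population, tournament_size)
        winner = max(contestants, key=fitness)
        child = mutate(winner, rng)
        # Reject the child if it falls below the cutoff on ANY architecture.
        if min(quick_accuracies(child)) < REJECTION_THRESHOLD:
            continue
        population.append(child)
        population.pop(0)  # assumed policy: drop the oldest member
    return population

def toy_mutate(layer, rng):
    """Toy mutation: nudge the stand-in score, clamped to [0, 1]."""
    return min(1.0, max(0.0, layer + rng.uniform(-0.1, 0.1)))

def toy_accuracies(layer):
    """Toy short-training accuracies on three hypothetical architectures."""
    return [layer, layer * 0.9, layer * 0.8]

initial = [0.3, 0.35, 0.4, 0.45, 0.5]
final = evolve(initial, fitness=lambda layer: layer,
               quick_accuracies=toy_accuracies, mutate=toy_mutate)
print(len(final), max(final))
```

In a real search the rejection check would be a short training run on each architecture, so cheaply filtering out unpromising children before full evaluation is where the efficiency gain comes from.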
The paper’s first author, Google Brain Research Scientist Liu Hanxiao, tweeted “Key ideas: (1) to start from low-level primitives, and (2) to evolve the layers’ generalization over multiple architectures.”
Automated machine learning (AutoML) has seen much research interest and activity in recent years, with Google coining the term for its own neural architecture search system in May 2017. The researchers say the discovery of the EvoNorms layers and their performance gains point to the potential of AutoML for discovering universal modules using layer search and for evolving novel ML concepts from low-level primitives.
The paper Evolving Normalization-Activation Layers is on arXiv.
Journalist: Fangyu Cai | Editor: Michael Sarazen