There are countless neural network architectures and possible variations available to machine learning researchers. Typically, the manual creation of a new network architecture for a particular task is a trial-and-error process that requires much time and heavy compute resources, and can also produce designs that struggle to scale up. Although various neural architecture search (NAS) techniques have been introduced to automate the discovery of top-performing neural networks, these are hardly a panacea, as they also require either time-consuming training of supernets or intensive architecture evaluations. This again results in heavy resource consumption, while also potentially introducing search biases due to truncated training or approximations.
Now, researchers from the University of Texas at Austin have proposed a novel framework called TE-NAS for “training-free” neural architecture search. The method ranks and selects high-performing neural architectures without training any of them, significantly reducing the cost of NAS and speeding up the search. It is introduced in the ICLR 2021 paper Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective.
The researchers leveraged two theoretically inspired indicators, the condition number of the Neural Tangent Kernel (NTK) and the number of linear regions in the input space, and designed a novel pruning-based search method that achieves a superior trade-off between them. Both indicators strongly correlate with network performance and can be measured without any training, which allows network trainability and expressivity to be analyzed separately and at low cost.
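To make the two indicators more concrete, here is a minimal NumPy sketch that measures both on a randomly initialised one-hidden-layer ReLU network. It is an illustration under stated assumptions, not the TE-NAS implementation: the toy network and the function names (`init_mlp`, `ntk_condition_number`, `count_linear_regions`) are hypothetical, and the region count is only an estimate based on the distinct activation patterns seen among the sampled inputs.

```python
import numpy as np

def init_mlp(rng, d_in, width):
    """Randomly initialise f(x) = w2 . relu(W1 x + b1); no training is performed."""
    W1 = rng.standard_normal((width, d_in)) * np.sqrt(2.0 / d_in)
    b1 = rng.standard_normal(width) * 0.1
    w2 = rng.standard_normal(width) / np.sqrt(width)
    return W1, b1, w2

def ntk_condition_number(params, X):
    """Condition number of the empirical NTK (J @ J.T) on the batch X."""
    W1, b1, w2 = params
    pre = X @ W1.T + b1                  # pre-activations, shape (n, width)
    act = np.maximum(pre, 0.0)           # hidden activations, also df/dw2
    gate = (pre > 0) * w2                # df/db1: ReLU gate times output weight
    # df/dW1[j, :] = gate[j] * x, flattened to one row per input sample
    jac_W1 = (gate[:, :, None] * X[:, None, :]).reshape(len(X), -1)
    jac = np.concatenate([jac_W1, gate, act], axis=1)
    ntk = jac @ jac.T                    # (n, n) kernel matrix
    return float(np.linalg.cond(ntk))    # largest / smallest singular value

def count_linear_regions(params, X):
    """Count the distinct ReLU activation patterns hit by the samples in X."""
    W1, b1, _ = params
    patterns = (X @ W1.T + b1) > 0       # boolean pattern per sample
    return len(np.unique(patterns, axis=0))

# Assumed toy setup: 16 random 4-dimensional inputs, two hidden widths.
rng = np.random.default_rng(0)
X = rng.standard_normal((16, 4))
for width in (8, 64):
    params = init_mlp(rng, 4, width)
    print(width, ntk_condition_number(params, X), count_linear_regions(params, X))
```

In the paper's framing, a smaller NTK condition number is associated with better trainability and a larger number of linear regions with greater expressivity, so a training-free search can compare candidate architectures on these two numbers alone.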
In experiments on the NAS-Bench-201 and DARTS search spaces using a single GTX 1080Ti, TE-NAS completed high-quality searches in just 0.5 GPU hours on the CIFAR-10 dataset and 4 GPU hours on ImageNet. The team says TE-NAS is the first approach that bridges the gap between the theoretical findings of deep neural networks and real-world NAS applications, and hopes the work can inspire further research in this area.
The paper Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective is available on OpenReview. The code is on GitHub.
Analyst: Yuqing Li | Editor: Michael Sarazen; Fangyu Cai