SJTU & MIT Paper Reinvents Neural Architecture Search; Slashes Computational Resource Requirements

A shortage of AI talent capable of manually designing neural architectures such as AlexNet and ResNet has spurred research into automatic architecture design. Google's Cloud AutoML is one example: a system that enables developers with limited machine learning expertise to train high-quality models. The trade-off, however, is AutoML's high computational cost.

Now, a research team from Shanghai Jiao Tong University (SJTU) and MIT has introduced a new function-preserving transformation for efficient neural architecture search. Their paper, Path-Level Network Transformation for Efficient Architecture Search, was published last month on arXiv.

The path-level network transformation is the linchpin of the paper. The search behind Google Brain’s NASNet-A builds architectures from scratch in a designed search space, while Net2Net operations take advantage of an existing neural architecture. The authors’ method, by contrast, allows replacing a single layer with a multi-branch structure whose merge scheme is either add or concatenation.
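To make "function-preserving" concrete, here is a minimal NumPy sketch of the two merge schemes mentioned above. It is an illustration under assumptions, not the authors' implementation: a dense layer with ReLU stands in for a convolutional layer, and the helper names (dense_relu, replicate_add, split_concat) are hypothetical. For the add scheme, each branch is a copy of the original layer and the outputs are averaged; for concatenation, the layer's output units are split across branches.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_relu(x, W, b):
    """Stand-in for a single layer: affine map followed by ReLU."""
    return np.maximum(x @ W + b, 0.0)

def replicate_add(x, W, b, n_branches=2):
    """Add merge: every branch is a copy of the layer; outputs are averaged."""
    branches = [dense_relu(x, W.copy(), b.copy()) for _ in range(n_branches)]
    return sum(branches) / n_branches

def split_concat(x, W, b, split_at):
    """Concatenation merge: each branch owns a slice of the output units."""
    left = dense_relu(x, W[:, :split_at], b[:split_at])
    right = dense_relu(x, W[:, split_at:], b[split_at:])
    return np.concatenate([left, right], axis=1)

# Hypothetical toy data: 4 inputs with 8 features, a layer with 6 output units.
x = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 6))
b = rng.normal(size=6)

original = dense_relu(x, W, b)
add_out = replicate_add(x, W, b)
concat_out = split_concat(x, W, b, split_at=2)

# Both multi-branch versions should reproduce the single layer's output.
print(np.allclose(original, add_out), np.allclose(original, concat_out))
```

Because both multi-branch structures compute the same function as the layer they replace (up to floating-point tolerance), the transformed network can start from the existing weights instead of being retrained from scratch. The branches can then diverge as training continues.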

The researchers also propose a tree-structured architecture space for the search, along with a reinforcement learning agent that acts as the meta-controller exploring that space.

The abstract explains: “Our proposed path-level transformation operations enable the meta-controller to modify the path topology of the given network while keeping the merits of reusing weights, and thus allow efficiently designing effective structures with complex path topologies like Inception models.”

In an experiment on learning CNN cell structures on CIFAR-10, the path-level network transformation technique used 200 GPU hours of computational resources, compared with 48,000 GPU hours for NASNet-A (a 240-fold reduction), while achieving better parameter efficiency and better overall results.


Journalist: Tony Peng | Editor: Michael Sarazen
