The increasingly impressive performance of deep neural networks (DNNs) in recent years has come at the cost of increasingly high computational burdens. As such, the design of efficient and even optimal architectures is vital for continued DNN development and deployment.
To advance research in this area, a Purdue University team has introduced a self-adaptive algorithm for optimal DNN design. Their adaptive network enhancement (ANE) method learns not only from given information but also from the current computer simulation.
Despite the achievements of DNNs, their approximation properties remain unclear, and researchers still do not fully understand why and how the networks have produced such significant improvements across a wide range of machine learning applications.
Although studies have suggested that deep networks can approximate many functions more accurately than shallow networks, rigorous proofs have yet to emerge to support the theoretical advantage of deep networks. Also, most current methods for DNN architecture design remain empirical, requiring tedious manual tuning of depth, width and hyperparameters.
In the paper Self-Adaptive Deep Neural Network: Numerical Approximation to Functions and PDEs, the Purdue researchers ask: What is the optimal network model required, in terms of width, depth, and the number of parameters, to learn data, a function, or the solution to a PDE within some prescribed accuracy? Their ANE method seeks to provide the answers.
ANE can be written as loops of the form train → estimate → enhance. The train step solves the optimization problem for the current neural network (NN); the estimate step computes a posteriori error estimators from the solution on the current NN; and the enhance step adds new neurons to the current NN.
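As a rough illustration of the train → estimate → enhance loop, the Python sketch below fits a one-dimensional function with a shallow ReLU network. It is not the authors' implementation. The function names (`train`, `estimate`, `enhance`, `adaptive_fit`) are made up for this example, as are several simplifying assumptions: the breakpoints stay fixed while the output weights are solved by linear least squares, the local error indicator is the mean squared residual, and new neurons are placed at interval midpoints selected by a `fraction` marking threshold.

```python
import numpy as np

def features(x, breakpoints):
    # Shallow ReLU network in 1D: constant, linear term, and one
    # neuron relu(x - b) per breakpoint b (an assumption of this sketch).
    cols = [np.ones_like(x), x] + [np.maximum(x - b, 0.0) for b in breakpoints]
    return np.stack(cols, axis=1)

def train(x, y, breakpoints):
    # "Train": with breakpoints held fixed, the output weights solve a
    # linear least-squares problem (a simplification of real NN training).
    coef, *_ = np.linalg.lstsq(features(x, breakpoints), y, rcond=None)
    return coef

def estimate(x, y, breakpoints, coef):
    # "Estimate": one error indicator per subinterval of the partition
    # induced by the breakpoints (hypothetical indicator: squared residual).
    resid2 = (features(x, breakpoints) @ coef - y) ** 2
    edges = np.concatenate(([x.min()], np.sort(breakpoints), [x.max()]))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, len(edges) - 2)
    indicators = np.bincount(idx, weights=resid2, minlength=len(edges) - 1) / len(x)
    return edges, indicators

def enhance(breakpoints, edges, indicators, fraction=0.5):
    # "Enhance": mark subintervals whose indicator is at least `fraction`
    # of the largest one and add a neuron at the midpoint of each.
    marked = indicators >= fraction * indicators.max()
    midpoints = 0.5 * (edges[:-1] + edges[1:])
    return np.sort(np.concatenate((breakpoints, midpoints[marked])))

def adaptive_fit(f, tol=1e-3, max_steps=20, n_samples=2001):
    x = np.linspace(0.0, 1.0, n_samples)
    y = f(x)
    breakpoints = np.array([0.5])  # start from a single hidden neuron
    history = []                   # (number of neurons, RMS error) per step
    for step in range(max_steps):
        coef = train(x, y, breakpoints)
        edges, indicators = estimate(x, y, breakpoints, coef)
        total = float(np.sqrt(indicators.sum()))
        history.append((len(breakpoints), total))
        if total <= tol or step == max_steps - 1:
            break
        breakpoints = enhance(breakpoints, edges, indicators)
    return breakpoints, coef, history
```

Calling `adaptive_fit(lambda x: np.tanh(50.0 * (x - 0.3)))` on a function with a sharp transitional layer tends to place the added neurons where the indicators are largest, which is the behaviour the loop is meant to illustrate.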
The researchers say that because their adaptive learning algorithm learns not only from given information such as data, functions and partial differential equations (PDEs) but also from the current computer simulation, it is more advanced than common machine learning algorithms.
Two essential questions are addressed at each adaptive step of the ANE method: 1) How many new neurons should be added? and 2) When should a new layer be added?
For a two-layer NN, the team introduces a network strategy that uses the physical partition of the current computer simulation, as determined by the a posteriori error indicators, to decide how many new neurons to add to the first hidden layer.
For a multi-layer NN, the team exploits the geometric properties of the current computer simulation and introduces a novel enhancement strategy to decide the number of new neurons that should be added at hidden layers beyond the first hidden layer. To determine when a new layer should be added, they introduce a computable quantity to measure the improvement rate of two consecutive NNs per the relative increase of the parameters.
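One plausible reading of this improvement rate is the relative error reduction divided by the relative growth in parameter count. The sketch below assumes that form. The exact definition in the paper may differ, and the function names and the `threshold` value are hypothetical.

```python
def improvement_rate(err_old, err_new, n_params_old, n_params_new):
    """Relative error reduction per relative parameter increase (assumed form)."""
    if n_params_new <= n_params_old:
        raise ValueError("the enhanced network must have more parameters")
    rel_error_drop = (err_old - err_new) / err_old
    rel_param_growth = (n_params_new - n_params_old) / n_params_old
    return rel_error_drop / rel_param_growth

def should_add_layer(err_old, err_new, n_params_old, n_params_new, threshold=0.5):
    # Hypothetical rule: if adding neurons to the existing layers barely
    # reduces the error, start a new layer instead.
    rate = improvement_rate(err_old, err_new, n_params_old, n_params_new)
    return rate < threshold
```

Under this reading, a low rate signals that widening the existing layers has stopped paying off, which is when adding depth becomes the more attractive option.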
Determining the values of a DNN’s parameters in the training stage is a challenging nonlinear optimization problem, as high-dimensional nonlinear optimization problems are often computationally intensive. One desirable remedy is to obtain a good initialization.
To this end, the proposed ANE method uses the physical partition of the domain to initialize the weights and biases of newly added neurons at the first hidden layer, learning solutions to the linear advection-reaction equations through a least-squares neural network (LSNN).
Through the application of these techniques, the team demonstrates that the proposed ANE method can automatically design a nearly minimal NN for learning functions exhibiting sharp transitional layers as well as discontinuous solutions of hyperbolic PDEs.
In the future, the researchers plan to extend the application of self-adaptive DNNs to tasks such as data fitting, classification, etc., where training data is limited. They believe the ANE method also has the potential to resolve the overfitting issue when data is noisy.
The paper Self-Adaptive Deep Neural Network: Numerical Approximation to Functions and PDEs is on arXiv.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang