The performance of deep neural networks (DNNs) relies heavily on their structures, and designing a good structure (aka architecture) tends to require extensive effort from human experts. The idea of an automatic structure-learning algorithm that can achieve performance on par with the best human-designed structures is thus increasingly appealing to machine learning researchers.
In the paper Learning Structures for Deep Neural Networks, a team from OneFlow and Microsoft explores unsupervised structure learning. Leveraging the efficient coding principle from information theory and computational neuroscience, they design a structure-learning method that requires no labelled information, and they demonstrate empirically that higher-entropy outputs in a deep neural network lead to better performance.
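The entropy claim is easy to make concrete. Below is a minimal sketch (not from the paper; the numbers are illustrative) that measures the Shannon entropy of a batch of output distributions — a near-uniform output carries maximal entropy, a near-one-hot output carries little.

```python
import numpy as np

def output_entropy(probs, eps=1e-12):
    """Average Shannon entropy (in nats) of a batch of output distributions."""
    p = np.clip(probs, eps, 1.0)  # guard against log(0)
    return float(np.mean(-np.sum(p * np.log(p), axis=1)))

# hypothetical softmax outputs for 3 examples over 4 classes
sharp = np.array([[0.97, 0.01, 0.01, 0.01]] * 3)  # confident: low entropy
flat = np.full((3, 4), 0.25)                      # uniform: maximal entropy
print(output_entropy(sharp))  # ≈ 0.168
print(output_entropy(flat))   # log(4) ≈ 1.386
```

In the paper's reading, higher entropy in a layer's outputs means the representation carries more information about the inputs, which is what the efficient coding principle asks for.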
The researchers start with the assumption that the optimal structure of neural networks can be derived from the input features without labels. Their study probes whether it is possible to learn good DNN network structures from scratch in a fully automatic fashion, and what would be a principled way to reach this end.
The team references a principle borrowed from the biological nervous system domain — the efficient coding principle — which posits that a good brain structure “forms an efficient internal representation of external environments.” They apply the efficient coding principle to DNN architecture, proposing that the structure of a well-designed network should match the statistical structure of its input signals.
The efficient coding principle suggests that the mutual information between a model’s inputs and outputs should be maximized, and the team supports this with a theoretical foundation based on Bayes-optimal classification. Specifically, they show that independence between the nodes of the top hidden layer is a sufficient condition for the network’s top layer (a softmax linear classifier) to act as a Bayes-optimal classifier. This theoretical foundation not only backs up the efficient coding principle, it also provides a way to determine the depth of a DNN.
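A classical special case illustrates why a softmax linear layer can be Bayes-optimal: when class-conditional features are independent unit-variance Gaussians, the exact Bayes posterior reduces to a softmax over a linear score. The sketch below (a standard textbook identity, not the paper's proof; means and priors are made up) checks the two computations agree.

```python
import numpy as np

# Hypothetical setting: 3 classes, class-conditional Gaussian features with
# unit variance and independent coordinates (means and priors are illustrative).
means = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 2.0]])
priors = np.array([0.5, 0.3, 0.2])

def bayes_posterior(x):
    """Exact Bayes posterior P(class | x) under the Gaussian assumption."""
    log_post = np.log(priors) - 0.5 * np.sum((x - means) ** 2, axis=1)
    p = np.exp(log_post - log_post.max())  # subtract max for stability
    return p / p.sum()

def softmax_linear(x):
    """Equivalent softmax linear classifier: w_k = mu_k, b_k = log pi_k - |mu_k|^2 / 2."""
    scores = means @ x + np.log(priors) - 0.5 * np.sum(means ** 2, axis=1)
    e = np.exp(scores - scores.max())
    return e / e.sum()

x = np.array([0.8, -0.3])
print(np.allclose(bayes_posterior(x), softmax_linear(x)))  # True
```

The quadratic term −½‖x‖² is shared by all classes and cancels inside the softmax, which is why the posterior collapses to a linear classifier.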
The team then investigates how to leverage the efficient coding principle in the design of a structure-learning algorithm, and shows that sparse coding can implement the principle under the assumption of zero-peaked and heavy-tailed prior distributions. This suggests that an effective structure learning algorithm can be designed based on global group sparse coding.
The proposed sparse-coding-based structure-learning algorithm learns a structure layer by layer in a bottom-up manner: the raw features form layer one, and given a predefined number of nodes in layer two, the algorithm learns the connections between the two layers, then repeats the process for subsequent layers.
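One bottom-up step can be sketched roughly as follows. This is a simplified stand-in, not the paper's global group sparse coding algorithm: it alternates least-squares codes with an ℓ1-penalized (ISTA) dictionary update, then reads the layer-1 → layer-2 connections off the dictionary's zero pattern. All function names and constants are illustrative.

```python
import numpy as np

def soft_threshold(Z, t):
    """Elementwise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def learn_connections(X, n_nodes, lam=0.5, n_outer=20, n_ista=30, seed=0):
    """Sketch of one layer of structure learning: learn a dictionary D
    (features x nodes) whose zeros mean 'no connection to that node'."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    D = rng.normal(size=(d, n_nodes)) / np.sqrt(d)
    for _ in range(n_outer):
        # codes for fixed D (ridge least squares)
        G = D.T @ D + 1e-6 * np.eye(n_nodes)
        A = X @ D @ np.linalg.inv(G)
        # sparse dictionary via ISTA on 0.5*||X - A D^T||^2 + lam*||D||_1
        L = np.linalg.norm(A.T @ A, 2) + 1e-12  # Lipschitz constant of the gradient
        for _ in range(n_ista):
            grad = (A @ D.T - X).T @ A
            D = soft_threshold(D - grad / L, lam / L)
    return (np.abs(D) > 0).astype(int)  # 0/1 connection mask

X = np.random.default_rng(1).normal(size=(200, 16))
mask = learn_connections(X, n_nodes=8)
print(mask.shape)  # (16, 8): which raw features feed each layer-2 node
```

Stacking such steps, with each layer's activations serving as the next layer's inputs, yields the layer-by-layer construction the article describes.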
The researchers also describe how this proposed algorithm can learn inter-layer connections, handle invariance, and determine DNN depth. Finally, they conduct intensive experiments on the popular CIFAR-10 data set to evaluate the classification accuracies of their proposed structure learning method, the role of inter-layer connections, and the role of structure masks and network depth.
The results show that a learned-structure single-layer network achieves an accuracy of 63.0 percent, outperforming the single-layer baseline of 60.4 percent. In an inter-layer connection density evaluation, the structures generated by the sparse coding approach outperform random structures and, at the same density level, always outperform the sparsifying restricted Boltzmann machine (RBM) baseline. In the structure mask evaluation, the structure prior provided by sparse coding is seen to improve performance. The network depth experiment, meanwhile, empirically justifies the proposed approach of determining DNN depth via coding efficiency.
Overall, the research demonstrates the efficient coding principle’s effectiveness for unsupervised structure learning, and shows that the proposed global-sparse-coding-based structure-learning algorithm can achieve performance comparable to the best human-designed structures.
The paper Learning Structures for Deep Neural Networks is on arXiv.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang