Deep neural networks (DNNs) have achieved astonishing performance on many complex tasks, but a major obstacle to their wider application remains: models require resource-intensive retraining whenever the task or the set of subclasses to be classified changes.
To address this issue, a research team from Fujitsu AI Laboratory, the University of Tokyo and the RIKEN Center for Advanced Intelligence Project (AIP) has proposed a modularization method that decomposes a DNN into small modules from a functionality perspective, then recomposes these modules into new models suited to other tasks.
The team summarizes their contributions as:
- We applied supermasks to a trained DNN model and constructed a module network, which is a subnetwork of the trained model, by pruning networks. We showed that the trained model for multiclass can be decomposed into subnetworks to classify a single class. We also showed that the classification task of arbitrary subclass sets can be solved by a linear combination of these modules.
- We proposed a new method for learning a supermask that can be trained to prune similar edges between modules. By adding the consistency of the supermask score of each layer to the loss function, we show that the supermask can learn to remove dissimilar edges among modules and can classify them with a smaller parameter size during recomposing.
- We demonstrated the effectiveness of our proposed method for decomposing and recomposing modular networks using open datasets.
In modular neural networks, training is typically done at the architectural level. The resulting modular structure enables faster inference, easier network analysis, and a reduction in both network size and the catastrophic forgetting problems associated with continual and incremental learning.
The researchers did not set out to create a special network architecture with a modular structure, but rather to extract and modularize subnetworks from a trained model, each of which can be treated as a binary classifier for a single class. Combining multiple such modules can thus produce models that classify any subtask. The team identifies the desired properties of such modular networks as: 1) decomposability, 2) recomposability, and 3) reusability/capability with small parameters.
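The idea of recomposing single-class modules into a classifier for an arbitrary subtask can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `make_module` and `recompose` helpers are hypothetical, each "module" is a simple prototype-distance scorer standing in for a pruned subnetwork, and picking the highest-scoring module is one plausible combination rule rather than the paper's exact procedure.

```python
# Toy illustration only: each "module" scores how strongly an input belongs
# to one class. In the paper a module is a pruned subnetwork of the trained
# DNN; here a prototype-distance scorer stands in for it (an assumption).

def make_module(prototype):
    """Return a hypothetical single-class scorer (higher = more likely)."""
    def score(x):
        return -sum((xi - pi) ** 2 for xi, pi in zip(x, prototype))
    return score

def recompose(modules, subtask_classes):
    """Build a classifier for a subset of classes, with no retraining."""
    selected = {c: modules[c] for c in subtask_classes}
    def classify(x):
        # Assumed combination rule: choose the class whose module scores highest.
        return max(selected, key=lambda c: selected[c](x))
    return classify

# One module per class of a hypothetical 4-class problem.
prototypes = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0)}
modules = {c: make_module(p) for c, p in prototypes.items()}

classify_1_vs_3 = recompose(modules, [1, 3])
classify_0_vs_2 = recompose(modules, [0, 2])

sample = (0.9, 0.8)
print(classify_1_vs_3(sample))  # closest of classes {1, 3}
print(classify_0_vs_2(sample))  # closest of classes {0, 2}
```

The same set of modules serves both subtasks, which is the reuse property the team is after: changing the subtask changes only which modules are selected.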
The proposed method decomposes a trained model for N-class problems into subnetworks for single-class prediction tasks, then recomposes neural networks without retraining. The researchers apply supermasks (binary score matrices that indicate whether each edge of the network should be pruned or not) to the trained model to extract the subnetworks required to classify a given subclass set. For fine-tuning, they apply a grafting layer that uses all N logits of the N-class trained model (if not masked) to predict the single-class classification. To obtain a subnetwork that can classify only the subtask, they reduce the size of individual modules, making each module’s supermask as similar as possible and constraining the forward inference. They also apply module co-training algorithms for computing stemming loss so that the extracted modules can be recomposed without a large increase in size.
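A minimal sketch may help make the supermask idea concrete. It assumes a single fully connected layer with frozen weights, a top-k rule for turning scores into a binary mask, and recomposition by taking the union of module masks. The helper names (`top_k_mask`, `masked_linear`, `union_mask`, `count_edges`) and all numbers are hypothetical, and the paper's actual training of the scores is not shown.

```python
# Sketch under stated assumptions: weights stay frozen, and a module is
# defined purely by which edges its binary supermask keeps.

def top_k_mask(scores, keep_ratio):
    """Keep the highest-scoring fraction of edges (ties may keep a few more)."""
    flat = sorted((s for row in scores for s in row), reverse=True)
    k = max(1, int(len(flat) * keep_ratio))
    threshold = flat[k - 1]
    return [[1 if s >= threshold else 0 for s in row] for row in scores]

def masked_linear(weights, mask, x):
    """Forward pass of one layer using only the unmasked edges."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]

def union_mask(mask_a, mask_b):
    """Recompose two modules by keeping every edge either one uses."""
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(mask_a, mask_b)]

def count_edges(mask):
    return sum(sum(row) for row in mask)

# Frozen weights of a hypothetical trained 3-input, 2-output layer.
weights = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
# Hypothetical learned scores for two single-class modules.
scores_a = [[0.9, 0.1, 0.8], [0.2, 0.7, 0.3]]
scores_b = [[0.8, 0.2, 0.9], [0.6, 0.1, 0.3]]

mask_a = top_k_mask(scores_a, keep_ratio=0.5)
mask_b = top_k_mask(scores_b, keep_ratio=0.5)
merged = union_mask(mask_a, mask_b)

print(masked_linear(weights, mask_a, [1.0, 2.0, 3.0]))
print(count_edges(mask_a), count_edges(mask_b), count_edges(merged))
```

In this toy case the two masks share two edges, so the merged module uses four edges rather than six. That overlap is what encouraging similar supermasks across modules is meant to increase, keeping recomposed models small.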
In their evaluations, the researchers used four fully connected models (FC1, FC2, FC3, FC4) on the MNIST, Fashion-MNIST, CIFAR-10/CIFAR-100 and SVHN datasets.
The results show that the proposed method can decompose a trained model into modules and recompose them into neural networks for classification problems while keeping the parameter count of the recomposed modules small, and that these new networks can solve subtasks with high accuracy. Moreover, the team says this approach can extract similar edges across modules on several datasets to reduce model size when recomposing modules, can be applied immediately without retraining, and is applicable to arbitrary DNNs.
The paper Neural Network Module Decomposition and Recomposition is on arXiv.
Author: Hecate He | Editor: Michael Sarazen