In the nearly 80 years since they were first envisioned, artificial neural networks (ANNs), fueled by ever-growing compute, have made tremendous progress toward replicating the function of the mammalian brains that inspired them. Although today’s systems have surpassed humans on many tasks, fundamental performance gaps remain. ANNs are a substantial simplification of the brain’s neural circuits and lack many of its important mechanisms, such as synaptic integration and local regulation of weight strength. They can also suffer from long training times, catastrophic forgetting and an inability to exploit increasing network complexity.
Advancing research in these areas is key to closing the gap between artificial networks and human or animal intelligence. In a new study, a team from IBM and ETH Zurich makes progress in reconciling neurophysiological insights with machine intelligence, proposing a novel biologically inspired optimizer for ANNs and spiking neural networks (SNNs) that incorporates synaptic integration principles from biology. Dubbed “GRAPES” (Group Responsibility for Adjusting the Propagation of Error Signals), the optimizer improves the convergence speed, accuracy and scalability of ANN and SNN training.
Synaptic integration is the process by which neurons in the central nervous system combine the thousands of synaptic inputs they receive into nerve impulse outputs. Previous work has suggested that the powerful computational abilities of neurons stem from these complex nonlinear dynamics and from the fact that local weight distributions are used to boost the input signal at specific nodes. Like biological neurons, ANN nodes receive many inputs and produce a single output, but they lack a weight distribution mechanism during the training phase.
Synaptic plasticity in the brain is driven mainly by local signals, and these local interactions between synapses play an important role in regulating weight changes during learning. The training of standard ANNs, however, relies on global signals rather than local information, which weakens their relative learning capacity.
The proposed GRAPES deep learning optimizer is inspired by biological mechanisms and designed to boost the training of fully connected neural networks (FCNNs). It can also be easily applied to more biologically plausible neuronal models such as spiking neural networks (SNNs).
The GRAPES algorithm modulates the error signal at each synaptic weight based on two quantities: node importance and a local modulation factor. With these quantities at hand, the error signal is adjusted through a Hadamard (element-wise) multiplication of the weight-update matrix with the local modulation matrix. In the propagating version of the algorithm, the modulation factor is integrated with the error signal and propagated to the upstream layers, where it is incorporated in the respective weight updates. According to the team, this propagating version greatly improves neural network classification accuracy and convergence speed compared to the local version.
The error signal modulation implemented in GRAPES parallels the biological mechanisms of heterosynaptic competition and synaptic scaling in two ways: the total weight information is used to adjust the weight update, leading to either strengthening or weakening of the synapses; and the local modulation factor is equal for all synapses projecting to the same node.
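The local version of this modulation can be illustrated with a small sketch. The article does not spell out the formulas, so the definitions below are assumptions made for illustration: node importance is taken to be the sum of the absolute incoming weights of a node, normalized by the largest such sum in the layer, and the modulation factor is taken to be 1 plus that importance. The function names (node_importance, local_modulation, modulated_update) are hypothetical and not drawn from the paper.

```python
from typing import List

Matrix = List[List[float]]  # rows = receiving nodes, columns = incoming synapses


def node_importance(weights: Matrix) -> List[float]:
    """ASSUMED definition: per-node sum of absolute incoming weights,
    normalized by the largest sum in the layer (values fall in [0, 1])."""
    raw = [sum(abs(w) for w in row) for row in weights]
    peak = max(raw)
    if peak == 0.0:
        return [0.0 for _ in raw]
    return [r / peak for r in raw]


def local_modulation(weights: Matrix) -> Matrix:
    """ASSUMED form: factor = 1 + importance. Every synapse arriving at the
    same node shares one factor, so each row holds a single repeated value."""
    importance = node_importance(weights)
    return [[1.0 + imp] * len(row) for imp, row in zip(importance, weights)]


def modulated_update(weights: Matrix, grad: Matrix, lr: float) -> Matrix:
    """Plain SGD step whose weight-update matrix is multiplied element-wise
    (Hadamard product) by the local modulation matrix."""
    modulation = local_modulation(weights)
    return [
        [w - lr * g * m for w, g, m in zip(w_row, g_row, m_row)]
        for w_row, g_row, m_row in zip(weights, grad, modulation)
    ]


if __name__ == "__main__":
    layer_weights = [[0.5, -0.5], [1.0, 1.0]]
    layer_grad = [[0.1, 0.2], [0.3, 0.4]]
    print(local_modulation(layer_weights))
    print(modulated_update(layer_weights, layer_grad, lr=1.0))
```

Under these assumptions, the node with the larger total incoming weight receives the larger modulation factor, so its synapses take proportionally bigger steps. The propagating version, in which the factor is also folded into the error signal sent to upstream layers, is not shown here.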
The team conducted several experiments to test GRAPES’ benefits on ANN training. On the task of handwritten digit classification, they reported test curves and related slowness fits for 10 × 256 ReLU networks trained on the MNIST data set, demonstrating that the testing curve for the GRAPES model saturates at a significantly higher accuracy plateau and has consistently smaller slowness parameters than the baseline SGD models.
The team also demonstrated GRAPES’ potential to boost SNN performance on temporal data. Applied to architectures implemented through a spiking neural unit (SNU) approach, GRAPES surpassed the SGD classification accuracy for different layer sizes.
The study introduces a novel concept of node importance, and GRAPES provides a simple and efficient strategy for dynamically adjusting error signals at each node. The team says this helps mitigate the performance degradation caused by hardware-related constraints and can boost performance, training efficiency and model scalability.
The paper Learning in Deep Neural Networks Using a Biologically Inspired Optimizer is on arXiv.
Author: Hecate He | Editor: Michael Sarazen