A master detective may examine a cigarette discarded in an ashtray and a strand of hair on a lapel and then declare they’ve solved the murder. Such amazing conclusions are reached through inferential reasoning, a nuanced and characteristically human ability to form predictions by connecting seemingly disparate or distant items and events. Inference is a hot topic for today’s neural network researchers, but even SOTA models still struggle to perform it well.
Now, DeepMind and University College London (UCL) have introduced a new deep network called MEMO, which matches SOTA results on Facebook’s bAbI dataset for testing text understanding and reasoning, and is the first architecture capable of solving novel reasoning tasks over long sequences.
Due to the high similarity between the bAbI training set and test set, neural networks can achieve deceptively strong results through overfitting. The researchers therefore introduced a new task called Paired Associative Inference (PAI), which is based on the neuroscientific literature and designed to test long-distance inference.
PAI is entirely procedurally generated, so neural networks cannot simply memorize answers and must instead learn the indirect relations between individual items. The task presents image pairs, and the model must associate two images that never appear together but were each paired with the same third image.
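To make the task structure concrete, here is a minimal sketch of how a PAI-style episode might be generated. This is an illustration, not DeepMind's generator: the function name, the use of integers in place of images, and the episode layout are all assumptions. Each triple (a, b, c) is shown to the model only as the direct pairs (a, b) and (b, c); the inference query asks which c belongs with a given a, a link the model never observes directly.

```python
import random

def make_pai_task(num_triples=4, seed=0):
    """Hypothetical sketch of a Paired Associative Inference episode.

    Integer items stand in for images. Each triple (a, b, c) is stored
    only as the direct pairs (a, b) and (b, c); the query asks the
    model to link a to c via the shared item b.
    """
    rng = random.Random(seed)
    items = list(range(3 * num_triples))
    rng.shuffle(items)
    triples = [tuple(items[i:i + 3]) for i in range(0, len(items), 3)]

    memories = []  # the direct pairs the model is allowed to store
    for a, b, c in triples:
        memories.append((a, b))
        memories.append((b, c))

    a, b, c = rng.choice(triples)
    distractors = [t[2] for t in triples if t[2] != c]
    query = {"cue": a, "target": c, "distractors": distractors}
    return memories, query

memories, query = make_pai_task()
# The cue-target pair itself never appears among the stored memories.
assert (query["cue"], query["target"]) not in memories
```

Because every episode is generated fresh, a model can only succeed by composing the two stored pairs, not by memorizing specific answers.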
Unlike other memory-augmented architectures such as End-to-End Memory Networks (EMN), which compress each fact into a fixed memory representation, MEMO stores memories in full detail and learns a linear projection paired with a powerful recurrent attention mechanism over them. Built on the same fundamental structure as EMN, MEMO adds new architectural components designed to support inferential reasoning.
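The core idea can be sketched as ordinary content-based attention in which the raw memories are kept intact and only projected through learned linear maps at read time. This is a simplified illustration under stated assumptions (the function names, plain-list tensors, and dot-product scoring are mine, not the paper's code):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(query, memories, w_key, w_value):
    """Minimal sketch of attention over memory slots kept in full detail.

    Each raw memory vector is mapped through separate learned linear
    projections for keys and values (w_key, w_value are row-lists of a
    weight matrix), rather than being squashed into one fixed
    representation at storage time.
    """
    keys = [[sum(w * m for w, m in zip(row, mem)) for row in w_key]
            for mem in memories]
    values = [[sum(w * m for w, m in zip(row, mem)) for row in w_value]
              for mem in memories]
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```

With identity projections, a query aligned with one memory retrieves a blend dominated by that memory, which is the behavior the recurrent attention mechanism builds on.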
In a REMERGE model (recurrency and episodic memory results in generalization), retrieved memory content is treated as a new query and recirculated through the network to find a match with an existing memory. By comparing the content retrieved at successive time steps of this recirculation, the network settles into a fixed point.
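The recirculation loop described above can be sketched in a few lines. This is a hedged illustration of the control flow only: the function names and the stopping tolerance are assumptions, and the actual retrieval step in REMERGE/MEMO is a learned network rather than the pluggable `retrieve` callable used here.

```python
def recirculate(query, memories, retrieve, max_steps=10, tol=1e-6):
    """Sketch of REMERGE-style recirculation.

    The content retrieved from memory is fed back in as the next
    query, looping until successive retrievals stop changing, i.e.
    the process settles into a fixed point.
    """
    prev = None
    out = query
    for _ in range(max_steps):
        out = retrieve(query, memories)
        if prev is not None and all(abs(a - b) < tol for a, b in zip(out, prev)):
            break  # fixed point reached: retrieval no longer changes
        prev = out
        query = out
    return out

def nearest(q, mems):
    """Toy retrieval: return the stored memory closest to the query."""
    return min(mems, key=lambda m: sum(abs(a - b) for a, b in zip(q, m)))
```

With the toy nearest-neighbor retrieval, a noisy query snaps to its closest stored memory and then stays there, which is exactly the fixed-point behavior the loop tests for.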
Inspired by REMERGE and adaptive computation time (ACT), the researchers trained MEMO with REINFORCE (a class of associative reinforcement learning algorithms for connectionist networks) to identify the optimal number of computation steps, and thus minimize the required computation without sacrificing performance.
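A toy sketch of how such adaptive halting works is shown below. All names here are hypothetical and the halting probability is a fixed scalar rather than a learned network: at every memory hop a stochastic policy decides whether to stop, and the log-probabilities of the sampled decisions are recorded so a REINFORCE update can later scale their gradient by a reward that favors correct answers reached in fewer hops.

```python
import math
import random

def run_with_halting(step_fn, halt_prob, max_hops=10, rng=None):
    """Toy sketch of ACT-style adaptive computation with a stochastic
    halting policy (hypothetical names, not MEMO's implementation).

    Returns the final state, the number of hops taken, and the summed
    log-probability of the sampled halt/continue decisions, which is
    the quantity a REINFORCE update would differentiate.
    """
    rng = rng or random.Random(0)
    state, log_probs = 0.0, []
    for hop in range(1, max_hops + 1):
        state = step_fn(state)               # one reasoning hop over memory
        halt = rng.random() < halt_prob      # sample the halting decision
        log_probs.append(math.log(halt_prob if halt else 1.0 - halt_prob))
        if halt:
            break
    return state, hop, sum(log_probs)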
In experiments, the researchers tested MEMO against the SOTA architectures EMN, Differentiable Neural Computer (DNC) and Universal Transformer (UT) on PAI inference query sets. MEMO scored higher overall on all the sets, performing especially well on the longest inference distance (A-E), where it nearly doubled the other architectures’ scores.
The researchers also evaluated MEMO on the bAbI dataset, where it matched the UT architecture’s SOTA results while recording a lower error rate.
Author: Reina Qi Wan | Editor: Michael Sarazen