Understanding the relative importance of input information in the neural network learning process could lead to improved model interpretability and new scientific discoveries. A popular approach for connecting information to learning is to use heuristic ablation techniques that mask or remove information to create simpler versions of an input, then analyze the network's predictions on these simplified inputs. But might there be a better way?
In the new paper When Less is More: Simplifying Inputs Aids Neural Network Understanding, a research team from University Medical Center Freiburg, ML Collective, and Google Brain introduces SimpleBits — an information-reduction method that learns to synthesize simplified inputs that contain less information yet remain informative for the task, providing a new method for exploring the basis of network decisions.
The researchers set out to answer two questions: How do neural network image classifiers respond to simpler and simpler inputs? And what do such responses reveal about the learning process? The paper identifies several requirements for answering these questions: a clear measure of input simplicity, an optimization objective that correlates with simplification, a framework incorporating such objectives into training and inference, and testbeds for evaluating the impact of this simplification on learning.
To measure input simplicity, the team leverages previous work (Kirichenko et al., 2020; Schirrmeister et al., 2020) showing that generative image models tend to assign higher probability densities, and hence lower encoding bits, to visually simpler inputs. Building on this finding, they optimize simplified inputs to minimize the encoding bit size assigned by a generative network trained on a general image distribution, while preserving task-relevant information.
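The objective can be sketched with a toy example. Here a standard Gaussian stands in for the pretrained generative model (so an input's encoding cost is proportional to its squared norm, up to constants) and a fixed linear map stands in for the task loss; the function name, the λ weight, and both stand-in models are illustrative assumptions, not the paper's actual networks.

```python
import numpy as np

def simplify(x, w, target, lam=0.1, lr=0.1, steps=200):
    """Toy SimpleBits-style simplification (illustrative sketch only).

    Minimizes: task_loss + lam * encoding_cost, where
      task_loss     = (w . x - target)^2   (a fixed linear 'classifier')
      encoding_cost = ||x||^2              (encoding bits under a
                                            standard-Gaussian stand-in for a
                                            pretrained generative model)
    """
    x = x.astype(float).copy()
    for _ in range(steps):
        task_grad = 2.0 * (x @ w - target) * w   # gradient of task loss
        bits_grad = 2.0 * lam * x                # gradient of simplicity penalty
        x -= lr * (task_grad + bits_grad)
    return x

# Only the first input dimension matters for the "task".
w = np.array([1.0, 0.0, 0.0])
x0 = np.array([1.0, 1.0, 1.0])   # original input with two distractor dims
xs = simplify(x0, w, target=1.0)
print(np.round(xs, 3))  # task-relevant dim survives; distractors shrink toward 0
```

The trade-off the paper reports is visible even here: raising `lam` shrinks the task-relevant component too, trading task performance for simpler (cheaper-to-encode) inputs.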
The team explores how SimpleBits affects network behaviour in a variety of scenarios, including conventional training, dataset condensation and post-hoc explanations, summarizing the resulting insights as:
- Per-instance simplification during training. SimpleBits successfully removes superfluous information in tasks with injected distractors. On natural image datasets, SimpleBits highlights plausible task-relevant information (shape, colour, texture). Increasing simplification leads to accuracy decreases, and the team reports the trade-off between input simplification and task-level performance across datasets.
- Dataset simplification with condensation. The team evaluates SimpleBits in a condensation setting that compresses the training data into a much smaller set of synthetic images. SimpleBits simplifies these images to drastically reduce encoding size without substantial loss in task performance. On a chest radiograph dataset (Johnson et al., 2019a,b), SimpleBits uncovers known radiologic features for pleural effusion and gender.
- Post-training auditing. For a trained model, the researchers explore SimpleBits as an interpretability tool for auditing. Their exploration of SimpleBits-guided audits suggests it can provide intuition into model behaviour on individual examples, including identifying features that may contribute to misclassifications.
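The post-training auditing use case can be sketched along the same lines: freeze the model, simplify a single input while constraining the simplified version to preserve the model's output, and inspect which components survive. As before, the linear model, the Gaussian encoding-cost stand-in, and all names here are hypothetical, not the paper's implementation.

```python
import numpy as np

def audit_input(x_orig, w, lam=0.1, lr=0.1, steps=300):
    """Toy post-hoc audit (illustrative sketch, not the paper's code).

    Starting from the original input, find a simpler input that preserves the
    frozen model's output w . x while penalizing encoding cost ||x||^2.
    Whatever survives simplification is what the decision plausibly rests on.
    """
    y_orig = x_orig @ w                       # frozen model's original output
    x = x_orig.astype(float).copy()
    for _ in range(steps):
        pred_grad = 2.0 * (x @ w - y_orig) * w   # keep the prediction fixed
        bits_grad = 2.0 * lam * x                # push toward a cheaper encoding
        x -= lr * (pred_grad + bits_grad)
    return x

w = np.array([1.0, 0.0, 0.0])            # toy model only uses dimension 0
x_orig = np.array([0.8, 0.5, -0.3])
x_simple = audit_input(x_orig, w)
# Dimension 0 survives (slightly scaled by the penalty); the distractor
# dimensions vanish, suggesting the prediction rests on that feature alone.
```

On a real network the same loop would run gradient descent through the frozen classifier and the pretrained generative model; a surviving spurious feature in `x_simple` would flag a potential source of misclassification.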
Overall, SimpleBits introduces an effective way to simplify model inputs that requires no domain-specific knowledge to constrain which input components should be removed, instead learning to remove the components least relevant to a given task. The researchers believe the work can help identify just what information a deep network classifier requires to learn its task, and can aid the more general study of neural network behaviours.
The paper When Less is More: Simplifying Inputs Aids Neural Network Understanding is on arXiv.
Author: Hecate He | Editor: Michael Sarazen