Recent developments in large-scale language models have showcased their remarkable ability to tackle a wide array of tasks using a single model. However, a significant challenge lies in comprehending their internal workings, particularly as scaling these models up also raises the bar for interpretability.
In a new paper titled “Neurons in Large Language Models: Dead, N-gram, Positional,” a research team from Meta AI and the Universitat Politècnica de Catalunya embarks on a comprehensive analysis of a family of Open Pre-trained Transformer Language Models (OPT) with parameters ranging up to 66 billion. Their goal is to shed light on how the feed-forward network (FFN) layers function within these models.
The team places particular emphasis on the neurons housed within the FFNs. They contend that FFN neurons are more likely than directions in the residual stream to represent meaningful features, because the elementwise nonlinearity applied to these neurons breaks rotational invariance and encourages features to align with the basis dimensions.
Essentially, when an FFN neuron is activated, it updates the residual stream by adding the corresponding row of the second FFN layer, scaled by the neuron's activation value. Conversely, when it is inactive (OPT uses ReLU, so the activation is exactly zero), it has no effect on the residual stream. This lets the researchers interpret an FFN neuron by determining when it activates and what update it writes to the residual stream.
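The row-wise view of the FFN update described above can be checked numerically. The following is a minimal NumPy sketch, not code from the paper: the tiny dimensions, random weights, and the helper name `ffn_update` are illustrative assumptions. It shows that the FFN output equals the sum of the second-layer rows belonging to active neurons, each scaled by its activation.

```python
import numpy as np

def ffn_update(x, W1, b1, W2, b2):
    """Return the FFN update to the residual stream and the per-neuron activations."""
    acts = np.maximum(0.0, x @ W1 + b1)  # ReLU: one non-negative activation per neuron
    update = acts @ W2 + b2              # what the layer adds to the residual stream
    return update, acts

# Toy sizes and random weights (assumptions for illustration only).
rng = np.random.default_rng(0)
d_model, d_ffn = 4, 8
x = rng.normal(size=d_model)
W1 = rng.normal(size=(d_model, d_ffn))
b1 = rng.normal(size=d_ffn)
W2 = rng.normal(size=(d_ffn, d_model))
b2 = np.zeros(d_model)

update, acts = ffn_update(x, W1, b1, W2, b2)

# Rebuild the update from active neurons only: inactive neurons contribute nothing.
active = np.flatnonzero(acts > 0)
row_sum = sum(acts[i] * W2[i] for i in active) + b2
print(np.allclose(update, row_sum))
```

Because inactive neurons drop out of the sum entirely, knowing which neurons fire and what their rows of `W2` contain is enough to account for the whole update.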
Their initial observations reveal that a significant portion of neurons never activate across diverse datasets. An analysis of neuron activation frequencies underscores the substantial prevalence of dormant neurons. For instance, in the 66 billion parameter model, some layers exhibit a proportion of dead neurons exceeding 70%.
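As a rough illustration of how such a proportion could be measured, the sketch below counts neurons that never fire over a set of recorded activations. It assumes the post-ReLU activations of one layer have already been collected into a `(num_tokens, num_neurons)` array; the function name and the toy data are hypothetical, not taken from the paper.

```python
import numpy as np

def dead_neuron_fraction(activations):
    """Fraction of neurons that never activate.

    activations: array of shape (num_tokens, num_neurons), post-ReLU.
    """
    ever_active = (activations > 0).any(axis=0)  # True if the neuron fired on any token
    return 1.0 - ever_active.mean()

# Toy data (assumption): 3 tokens, 4 neurons; neurons 0 and 2 never fire.
acts = np.array([
    [0.0, 1.2, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.3],
    [0.0, 0.5, 0.0, 0.0],
])

frac = dead_neuron_fraction(acts)
freq = (acts > 0).mean(axis=0)  # per-neuron activation frequency
print(frac)  # 0.5
```

In practice the result depends on how large and diverse the evaluation data is: a neuron counts as dead only relative to the tokens it was tested on.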
Next, focusing on the patterns encoded in the lower half of the models, the researchers investigate how neuron activations correlate with input n-grams. Their findings indicate that in larger models, neurons are covered by fewer n-grams, consistent with the hypothesis that the model assigns discrete shallow patterns to specifically designated neurons.
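One simple way to make "covered by fewer n-grams" concrete is to count how many distinct tokens account for most of a neuron's activations. The sketch below does this for unigrams only; the 90% threshold, the function name, and the toy data are assumptions for illustration and may differ from the paper's exact procedure.

```python
from collections import Counter

def covering_unigrams(tokens, active, coverage=0.9):
    """Smallest number of distinct tokens that account for `coverage` of a neuron's activations."""
    counts = Counter(tok for tok, on in zip(tokens, active) if on)
    total = sum(counts.values())
    if total == 0:
        return 0  # the neuron never fired on this data
    covered = 0
    # Greedily take the most frequent trigger tokens first.
    for k, (_, c) in enumerate(counts.most_common(), start=1):
        covered += c
        if covered >= coverage * total:
            return k
    return len(counts)

# Toy data (assumption): the neuron fires mostly on "the".
tokens = ["the", "cat", "the", "dog", "the", "cat"]
active = [True, False, True, False, True, True]

print(covering_unigrams(tokens, active, coverage=0.9))   # 2
print(covering_unigrams(tokens, active, coverage=0.75))  # 1
```

A neuron with a small covering set behaves like a detector for a few specific tokens, which is the kind of specialization the finding above describes.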
Furthermore, the researchers identify certain neurons responsible for encoding positional information, irrespective of textual patterns. This discovery suggests that FFN layers can be employed by the model in ways that extend beyond the conventional key-value memory perspective.
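To give a feel for what a position-dependent neuron looks like, the sketch below compares per-position activation rates for two simulated neurons. This is an illustrative stand-in: the article does not describe the paper's own criterion for identifying positional neurons, and both neurons here are synthetic.

```python
import numpy as np

def activation_rate_by_position(active):
    """active: bool array of shape (num_sequences, seq_len) for a single neuron."""
    return active.mean(axis=0)  # fraction of sequences in which the neuron fires at each position

rng = np.random.default_rng(0)
num_seqs, seq_len = 200, 16

# Hypothetical positional neuron: fires on the first four positions, whatever the text is.
positional = np.zeros((num_seqs, seq_len), dtype=bool)
positional[:, :4] = True

# Hypothetical content-driven neuron: fires about 25% of the time at every position.
content = rng.random((num_seqs, seq_len)) < 0.25

pos_rate = activation_rate_by_position(positional)
con_rate = activation_rate_by_position(content)

# A large spread across positions suggests the neuron tracks position rather than content.
pos_spread = pos_rate.max() - pos_rate.min()
con_spread = con_rate.max() - con_rate.min()
print(pos_spread, con_spread)
```

The positional neuron's rate jumps from 1 to 0 along the sequence, while the content-driven neuron's rate stays roughly flat.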
Overall, by analyzing whether each FFN neuron is activated or not across the OPT family of models, ranging from 125M to 66B parameters, the team identifies neurons that:
- are “dead”, i.e. never activate on a large diverse collection of data;
- act as token- and n-gram detectors that, in addition to promoting next token candidates, explicitly remove current token information;
- encode position regardless of textual content, which indicates that the role of FFN layers extends beyond the key-value memory view.
The team states that, to their knowledge, this work presents the first example of mechanisms specialized in removing information from the residual stream rather than adding to it. They believe their findings offer valuable insight into how large language models achieve their impressive capabilities.
The paper “Neurons in Large Language Models: Dead, N-gram, Positional” is on arXiv.
Author: Hecate He | Editor: Chain Zhang