DeepMind & Stanford U’s UNFs: Advancing Weight-Space Modeling with Universal Neural Functionals

A Google DeepMind and Stanford University team introduces universal neural functionals (UNFs), an algorithm that automatically constructs permutation-equivariant models for any weight space, removing the architectural constraints of prior weight-space methods.

Many problems in machine learning involve processing weight-space features: the weights, gradients, or sparsity masks of neural networks. Recent work has made encouraging progress on weight-space models that are equivariant to the permutation symmetries of simple feedforward networks. Extending these methods to more complex architectures has proven challenging, however, because recurrent or residual connections compound the permutation symmetries of the weight space.
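To make the symmetry concrete, consider a minimal NumPy sketch (our own illustration, not code from the paper): permuting a network's hidden units, while permuting the rows of the first weight matrix and the columns of the second to match, leaves the network's input-output function unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0)

# Hypothetical two-layer MLP: f(x) = W2 @ relu(W1 @ x)
W1 = rng.normal(size=(16, 8))
W2 = rng.normal(size=(4, 16))
x = rng.normal(size=8)

def f(A, B):
    return B @ relu(A @ x)

# Permute the 16 hidden units: rows of W1 and columns of W2 move together
P = np.eye(16)[rng.permutation(16)]
assert np.allclose(f(W1, W2), f(P @ W1, W2 @ P.T))  # same function
```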

In the new paper Universal Neural Functionals, a research team from Google DeepMind and Stanford University introduces universal neural functionals (UNFs), an algorithm that automatically constructs permutation-equivariant models for any weight space, resolving the architectural constraints encountered by prior work. The researchers also demonstrate UNFs in practice by integrating them into existing learned optimizer designs, finding promising improvements over prior methods when optimizing small image classifiers and language models.

The team's construction rests on a key observation: equivariance is preserved under composition, and pointwise nonlinearities are themselves permutation equivariant. Given an equivariant linear layer, deep equivariant models can therefore be built by stacking. Combining equivariant layers with an invariant pooling operation additionally yields deep invariant models, further expanding the scope of applications.
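This logic can be verified with a toy DeepSets-style layer (our illustration; the paper's layers are considerably more general): composing equivariant layers with pointwise nonlinearities remains equivariant, and pooling at the end makes the model invariant.

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda z: np.maximum(z, 0)

# A toy permutation-equivariant linear layer on a vector:
# L(x) = a * x + b * mean(x)
def equiv_layer(x, a, b):
    return a * x + b * x.mean()

def deep_equivariant(x):
    # Stacking equivariant layers with pointwise nonlinearities
    # preserves equivariance.
    h = relu(equiv_layer(x, 1.5, -0.3))
    return equiv_layer(h, 0.7, 2.0)

x = rng.normal(size=10)
perm = rng.permutation(10)

# Equivariance: permuting the input permutes the output the same way
assert np.allclose(deep_equivariant(x[perm]), deep_equivariant(x)[perm])

# Appending an invariant pooling yields a permutation-invariant model
assert np.allclose(deep_equivariant(x[perm]).sum(),
                   deep_equivariant(x).sum())
```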

The proposed algorithm automatically constructs a basis for the permutation-equivariant linear maps between tensors of arbitrary rank. Each basis function is realized through straightforward array operations, ensuring compatibility with modern deep learning frameworks and enabling efficient computation.
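For intuition, here is the simplest instance of such a basis (a hedged sketch in our own notation, not the paper's algorithm): the linear maps from one permutable vector to another that commute with every permutation span just two basis functions, each a cheap array operation. UNFs derive analogous bases for arbitrary-rank weight tensors automatically.

```python
import numpy as np

# Basis for the linear permutation-equivariant maps from a rank-1
# tensor (one permutable axis) to itself: just two elements.
def basis_identity(x):
    return x                          # pass each entry through unchanged

def basis_pool_broadcast(x):
    return np.full_like(x, x.sum())   # pool over the axis, broadcast back

def equiv_map(x, c):
    # Any equivariant linear map on this space is a learned
    # combination of the basis functions.
    return c[0] * basis_identity(x) + c[1] * basis_pool_broadcast(x)

x = np.arange(5.0)
perm = np.random.default_rng(2).permutation(5)
c = (0.9, 0.1)
assert np.allclose(equiv_map(x[perm], c), equiv_map(x, c)[perm])
```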

A universal neural functional is then constructed by stacking several such layers, interleaved with pointwise nonlinearities, into a deep permutation-equivariant model that processes weights. To obtain a permutation-invariant model instead, an invariant pooling layer is appended after the equivariant layers, as sketched below.
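Putting the pieces together, a minimal end-to-end sketch might look as follows (our simplification: each weight matrix is mixed independently along its row and column axes, whereas UNFs also account for the permutations shared between adjacent layers of the processed network):

```python
import numpy as np

relu = lambda z: np.maximum(z, 0)

def equiv_matrix_layer(W, c):
    # Equivariant mixing along both permutable axes of a weight matrix:
    # identity, row-pooling, column-pooling, and global-pooling terms.
    row_mean = W.mean(axis=1, keepdims=True)
    col_mean = W.mean(axis=0, keepdims=True)
    return c[0] * W + c[1] * row_mean + c[2] * col_mean + c[3] * W.mean()

def weight_space_model(weights):
    # Stack equivariant layers with pointwise nonlinearities, then
    # append invariant pooling for a permutation-invariant output
    # (e.g., a predicted generalization score for the network).
    feats = []
    for W in weights:
        h = relu(equiv_matrix_layer(W, (1.0, 0.5, 0.5, 0.1)))
        h = equiv_matrix_layer(h, (0.8, -0.2, 0.3, 0.0))
        feats.append(h.mean())  # invariant pooling per weight tensor
    return float(np.sum(feats))

rng = np.random.default_rng(3)
score = weight_space_model([rng.normal(size=(16, 8)),
                            rng.normal(size=(4, 16))])
```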

In their empirical evaluation, the researchers compare UNFs against prior methods on two categories of weight-space tasks: predicting the generalization of recurrent sequence-to-sequence models, and training learned optimizers for diverse architectures and datasets. The results demonstrate the efficacy of UNFs in tasks that manipulate the weights and gradients of convolutional image classifiers, recurrent sequence-to-sequence models, and Transformer language models. Particularly noteworthy are the promising improvements over existing learned optimizer designs in small-scale experiments.

In summary, universal neural functionals mark a significant step forward in weight-space modeling, offering a versatile and effective framework for handling the permutation symmetries of neural network architectures. Through their automated construction of permutation-equivariant models, UNFs stand poised to enable further advances in machine learning research and applications.

The paper Universal Neural Functionals is on arXiv.


Author: Hecate He | Editor: Chain Zhang


