Powerful architectures such as vision transformers (ViTs) have enabled countless performance advancements in computer vision over the last couple of years, stimulating demand for new software and infrastructure to support easy and extensible neural network architecture research in this rapidly expanding field.
In a new paper, a research team from Google Brain and Google Research introduces SCENIC, an open-source JAX library designed to meet these needs in computer vision research and beyond. SCENIC currently supports implementations of state-of-the-art vision models such as ViT, DETR and MLP Mixer, and more open-sourced cutting-edge projects will be added in the near future.
SCENIC was developed in JAX, an easy-to-use library that enables automatic differentiation of native Python and NumPy functions and supports multi-host and multi-device training on accelerators such as GPUs and TPUs, which the researchers say makes it ideal for large-scale model research.
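To make the automatic differentiation point concrete, here is a minimal sketch in plain JAX. It does not use any SCENIC code; the function name `mse_loss` and the toy data are assumptions made purely for illustration.

```python
import jax
import jax.numpy as jnp


def mse_loss(w, x, y):
    """Mean squared error of a linear model, written in NumPy style."""
    pred = jnp.dot(x, w)
    return jnp.mean((pred - y) ** 2)


# jax.grad builds a function that returns d(loss)/d(w); by default it
# differentiates with respect to the first argument. jax.jit compiles it.
grad_fn = jax.jit(jax.grad(mse_loss))

# Toy data (illustrative only): two examples with two features each.
x = jnp.array([[1.0, 2.0], [3.0, 4.0]])
y = jnp.array([1.0, 2.0])
w = jnp.zeros(2)

g = grad_fn(w, x, y)
print("gradient:", g)

# Number of accelerator (or CPU) devices visible to this process.
print("local devices:", jax.local_device_count())
```

The same loss function runs unchanged on CPU, GPU or TPU. Scaling to several devices or hosts takes additional parallelization code that this sketch leaves out.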
The team summarizes the SCENIC toolkit as 1) shared lightweight libraries for solving commonly encountered tasks when training large-scale (i.e., multi-device, multi-host) models in vision and beyond; and 2) projects containing fully fleshed-out problem-specific training and evaluation loops using these libraries.
SCENIC is designed as a unified and flexible framework and contains both project-level and library-level code. Unlike other libraries, SCENIC can support projects that only require changing hyperparameters as well as those that require customization of the input pipeline, model architecture, losses and metrics, or the training loop.
SCENIC also includes optimized implementations of a large set of state-of-the-art research models, including ViT, DETR, MLP Mixer, ResNet and U-Net, across modalities that include video, image, audio and text. SCENIC has been employed in numerous Google research papers, such as ViViT, OmniNet, TokenLearner and MBT, and additional advanced projects will be incorporated into the SCENIC repository in the future.
The team believes SCENIC can help researchers in computer vision and beyond to quickly test and scale ideas for the development of new and improved neural network architectures.
Author: Hecate He | Editor: Michael Sarazen