
Google Open-Sources SCENIC: A JAX Library for Rapid Computer Vision Model Prototyping and Cutting-Edge Research

A research team from Google Brain and Google Research introduces SCENIC, an open-source JAX library for fast and extensible computer vision research and beyond. SCENIC currently supports implementations of state-of-the-art vision models such as ViT, DETR and MLP Mixer, and more open-sourced cutting-edge projects will be added in the near future.

Powerful architectures such as vision transformers (ViTs) have enabled countless performance advancements in computer vision over the last couple of years, stimulating demand for new software and infrastructure to support easy and extensible neural network architecture research in this rapidly expanding field.

In a new paper, a research team from Google Brain and Google Research introduces SCENIC, an open-source JAX library designed to meet these needs in computer vision research and beyond. SCENIC currently supports implementations of state-of-the-art vision models such as ViT, DETR and MLP Mixer, and more open-sourced cutting-edge projects will be added in the near future.

SCENIC was developed in JAX, an easy-to-use library that enables automatic differentiation of native Python and NumPy functions and supports multi-host and multi-device training on accelerators such as GPUs and TPUs, which the researchers say makes it ideal for large-scale model research.
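To illustrate the automatic differentiation described above, here is a minimal sketch in plain JAX. It does not use SCENIC itself; the function `mse_loss` and the toy data are assumptions made up for this illustration, while `jax.grad` and `jax.jit` are standard JAX transformations.

```python
import jax
import jax.numpy as jnp


def mse_loss(params, x, y):
    """Mean squared error of a toy linear model: pred = x @ w + b."""
    w, b = params
    pred = x @ w + b
    return jnp.mean((pred - y) ** 2)


# jax.grad differentiates with respect to the first argument (params);
# jax.jit compiles the resulting function for the available accelerator.
grad_fn = jax.jit(jax.grad(mse_loss))

# Toy data, invented for this sketch.
x = jnp.array([[1.0, 2.0], [3.0, 4.0]])
y = jnp.array([1.0, 2.0])
params = (jnp.zeros(2), jnp.array(0.0))

# grads has the same structure as params: (gradient for w, gradient for b).
grads = grad_fn(params, x, y)
print(grads)
```

The loss is written as an ordinary Python function over NumPy-style arrays, and the gradient function is derived from it rather than coded by hand, which is the property the researchers highlight.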

The team summarizes the SCENIC toolkit as 1) shared light-weight libraries for solving commonly encountered tasks when training large-scale (i.e. multi-device, multi-host) models in vision and beyond; and 2) projects containing fully fleshed-out problem-specific training and evaluation loops using these libraries.

SCENIC is designed as a unified and flexible framework and contains both project-level and library-level code. Unlike other libraries, SCENIC can support projects that only require changing hyperparameters as well as those that require customization of the input pipeline, model architecture, losses and metrics, or the training loop.
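The idea of swapping in a custom loss while reusing a shared training step can be sketched as follows. This is a hypothetical illustration in plain JAX, not SCENIC's actual API: `make_train_step`, `l2_loss`, the parameter layout and the batch keys are all assumptions invented for this example.

```python
import jax
import jax.numpy as jnp


def make_train_step(loss_fn, learning_rate=0.1):
    """Builds one SGD step around any loss_fn(params, batch) -> scalar.

    Hypothetical helper for illustration; not part of SCENIC.
    """

    @jax.jit
    def train_step(params, batch):
        # Compute the loss value and its gradient in a single pass.
        loss, grads = jax.value_and_grad(loss_fn)(params, batch)
        # Apply a plain gradient-descent update to every parameter leaf.
        new_params = jax.tree_util.tree_map(
            lambda p, g: p - learning_rate * g, params, grads
        )
        return new_params, loss

    return train_step


def l2_loss(params, batch):
    """A project-specific loss; any function with this signature would work."""
    pred = batch["inputs"] @ params["w"]
    return jnp.mean((pred - batch["targets"]) ** 2)


# Toy parameters and batch, invented for this sketch.
params = {"w": jnp.zeros(2)}
batch = {
    "inputs": jnp.array([[1.0, 0.0], [0.0, 1.0]]),
    "targets": jnp.array([1.0, -1.0]),
}

step = make_train_step(l2_loss)
new_params, loss = step(params, batch)
_, next_loss = step(new_params, batch)
```

In this sketch only `l2_loss` is specific to the project; the step builder stays the same when the loss changes, which loosely mirrors the split between library-level and project-level code described above.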

SCENIC also includes optimized implementations of a large set of state-of-the-art research models, such as ViT, DETR, MLP Mixer, ResNet and U-Net, across modalities that include video, image, audio and text. SCENIC has been employed in numerous Google research papers, including ViViT, OmniNet, TokenLearner and MBT, and additional advanced projects will be incorporated into the SCENIC repository in the future.

The team believes SCENIC can help researchers in computer vision and beyond to quickly test and scale ideas for the development of new and improved neural network architectures.

The SCENIC code has been open-sourced on the project’s GitHub. The paper SCENIC: A JAX Library for Computer Vision Research and Beyond is on arXiv.


Author: Hecate He | Editor: Michael Sarazen

