Computer Vision & Graphics | Synced

by Synced 2022-02-15 0

UC Berkeley, Waymo & Google’s Block-NeRF Neural Scene Representation Method Renders an Entire San Francisco Neighbourhood

In the new paper Block-NeRF: Scalable Large Scene Neural View Synthesis, a team from UC Berkeley, Waymo and Google Research proposes Block-NeRF, a neural radiance fields variant capable of representing city-scale environments.

by Synced 2022-02-14 0

AI Computer Vision & Graphics Machine Learning & Data Science Research

Google’s MaskGIT Outperforms SOTA Transformer Models on Conditional Image Generation and Accelerates Autoregressive Decoding by up to 64x

A Google Research team proposes Masked Generative Image Transformer (MaskGIT), a novel image synthesis paradigm that uses a bidirectional transformer decoder. MaskGIT significantly outperforms state-of-the-art transformer models on the ImageNet dataset and accelerates autoregressive decoding by up to 64x.

by Synced 2022-01-24 2

AI Computer Vision & Graphics Machine Learning & Data Science Research

Meta AI’s OMNIVORE: A Modality-Agnostic Single Vision Model With Cross-Modal Generalization

A Meta AI research team presents OMNIVORE, a single vision model for various visual modalities that can perform cross-modal generalization and achieves performance at par or better than traditional modality-specific models of the same size.

by Synced 2022-01-17 15

AI Computer Vision & Graphics Machine Learning & Data Science Popular Research

Pushing the Limits of Self-Supervised ResNets: DeepMind’s ReLICv2 Beats Strong Supervised Baselines on ImageNet

A DeepMind research team proposes ReLICv2, which demonstrates for the first time that representations learned without labels can consistently outperform a strong, supervised baseline on ImageNet and even achieve comparable results to state-of-the-art self-supervised vision transformers (ViTs).

by Synced 2022-01-13 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

Facebook AI & UC Berkeley’s ConvNeXts Compete Favourably With SOTA Hierarchical ViTs on CV Benchmarks

A team from Facebook AI Research and UC Berkeley proposes ConvNeXts, a pure ConvNet model that achieves performance comparable with state-of-the-art hierarchical vision transformers on computer vision benchmarks while retaining the simplicity and efficiency of standard ConvNets.

by Synced 2021-12-29 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

ETH Zurich Proposes Exemplar Transformers: Robust Visual Tracking That’s 8x Faster and CPU-Compatible

In the new paper Efficient Visual Tracking with Exemplar Transformers, ETH Zurich researchers propose Exemplar Transformers for real-time visual object tracking that’s up to 8× faster than other transformer-based models.

by Synced 2021-12-24 9

AI Computer Vision & Graphics Machine Learning & Data Science Research

OpenAI Releases GLIDE: A Scaled-Down Text-to-Image Model That Rivals DALL-E Performance

An OpenAI research team proposes GLIDE (Guided Language-to-Image Diffusion for Generation and Editing) for high-quality synthetic image generation. Human evaluators prefer GLIDE samples over DALL-E’s, and the model size is much smaller (3.5 billion vs. 12 billion parameters).

by Synced 2021-12-22 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

Facebook AI & JHU’s MaskFeat Method Surpasses Kaiming He’s MAE, Sets New SOTA in Video Action Recognition

In the new paper Masked Feature Prediction for Self-Supervised Visual Pre-Training, a Facebook AI Research and Johns Hopkins University team presents a novel Masked Feature Prediction (MaskFeat) approach for the self-supervised pretraining of video models that achieves SOTA results on video benchmarks.

by Synced 2021-12-16 4

AI Computer Vision & Graphics Machine Learning & Data Science Research

NVIDIA’s AdaViT Halts Token Computation to Adaptively Adjust ViT Inference Cost on Images of Different Complexity

Nvidia researchers propose AdaViT, an input-dependent mechanism that adaptively adjusts vision transformers’ inference cost by halting the compute of different tokens at different depths to reserve compute for discriminative tokens.

by Synced 2021-11-29 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

Microsoft’s ‘Florence’ General-Purpose Foundation Model Achieves SOTA Results on Dozens of CV Benchmarks

In the paper A New Foundation Model for Computer Vision, a Microsoft research team proposes Florence, a novel foundation model for computer vision that significantly outperforms previous large-scale pretraining approaches and achieves new SOTA results across a wide range of visual and visual-linguistic benchmarks.

by Synced 2021-11-22 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

Microsoft Asia’s Swin Transformer V2 Scales the Award-Winning ViT to 3 Billion Parameters and Achieves SOTA Performance on Vision Benchmarks

Microsoft Research Asia has upgraded their Swin Transformer with a new version featuring three billion parameters to train images with resolutions up to 1,536 x 1,536 and advance the SOTA on four representative vision benchmarks.

by Synced 2021-11-15 2

AI Company Computer Vision & Graphics Global News Research US & Canada

Google’s Pet Portraits Will Find Art Doubles for Your Pets

Google recently has launched an adorable new feature for its Arts and Culture app named Pet Portraits that can compare your pet photo to artworks from museums worldwide and find their art doubles.

by Synced 2021-11-15 2

AI Computer Vision & Graphics Machine Learning & Data Science Popular Research

A Leap Forward in Computer Vision: Facebook AI Says Masked Autoencoders Are Scalable Vision Learners

In a new paper, a Facebook AI team advances autoencoding methods to the computer vision field and shows that masked autoencoders (MAE) are scalable self-supervised learners.

by Synced 2021-10-29 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

DeepMind Study Resolves Delusions in Sequence Models for Interaction and Control

In the new paper Shaking the Foundations: Delusions in Sequence Models for Interaction and Control, a DeepMind research team explores the origins of mismatch problems in sequence models that lack understanding of the cause and effect of their actions, and addresses the problem by treating actions as causal interventions.

by Synced 2021-10-28 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

Softmax-free Vision Transformer With Linear Complexity: Achieving a Superior Accuracy/Complexity Trade-off

Researchers from Fudan University, University of Surrey and Huawei Noah’s Ark Lab identify the limitations of quadratic complexity for vision transformers (ViTs) as rooted in keeping the softmax self-attention during approximations. The team proposes the first softmax-free transformer (SOFT), which reduces the self-attention computation to linear complexity, achieving a superior trade-off between accuracy and complexity.

by Synced 2021-10-27 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

Google Open-Sources SCENIC: A JAX Library for Rapid Computer Vision Model Prototyping and Cutting-Edge Research

A research team from Google Brain and Google Research introduces SCENIC, an open-source JAX library for fast and extensible computer vision research and beyond. JAX currently supports implementations of state-of-the-art vision models such as ViT, DETR and MLP Mixer, and more open-sourced cutting-edge projects will be added in the near future.

by Synced 2021-10-19 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

StyleNeRF: A 3D-Aware Generator for High-Resolution Image Synthesis with Explicit Style Control

In a paper currently under double-blind review for ICLR 2022, researchers propose StyleNeRF, a 3D-aware generative model that can synthesize high-resolution images at interactive rates while preserving high-quality 3D consistency, and can even generalize to unseen views with control on styles and poses.

by Synced 2021-10-13 0

AI Community Computer Vision & Graphics Global Global News Research

ICCV 2021 Best Papers Announced

On October 13, ICCV 2021 announced its Best Paper Awards, honourable mentions, and Best Student Paper.

by Synced 2021-10-12 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

Are Patches All You Need? New Study Proposes Patches Are Behind Vision Transformers’ Strong Performance

A research team proposes ConvMixer, an extremely simple model designed to support the argument that the impressive performance of vision transformers (ViTs) is mainly attributable to their use of patches as the input representation. The study shows that ConvMixer can outperform ViTs, MLP-Mixers and classical vision models.

by Synced 2021-10-08 0

AI Computer Vision & Graphics Machine Learning & Data Science Research

Apple Study Reveals the Learned Visual Representation Similarities and Dissimilarities Between Self-Supervised and Supervised Methods

An Apple research team performs a comparative analysis on a contrastive self-supervised learning (SSL) algorithm (SimCLR) and a supervised learning (SL) approach for simple image data in a common architecture, shedding light on the similarities and dissimilarities in their learned visual representation patterns.

by Synced 2021-10-06 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

Google Significantly Improves Visual Representations by Adding Explicit Information Compression

A Google Research team presents compressive variants of SimCLR and BYOL that yield better and more robust visual representations.

by Synced 2021-10-04 2

AI Computer Vision & Graphics Machine Learning & Data Science Research

Debiasing Image Datasets: Oxford University Presents PASS, an ImageNet Replacement for Self-Supervised Pretraining

An Oxford University research team presents PASS, a large (1.28M) image collection excluding humans, created as an ImageNet replacement for self-supervised pretraining without technical, ethical or legal issues.

by Synced 2021-08-27 8

AI Computer Vision & Graphics Machine Learning & Data Science Research

Google Brain Uncovers Representation Structure Differences Between CNNs and Vision Transformers

A Google Brain research team explores the internal representation structures of ViTs and CNNs on image classification tasks, providing insights on key differences between the two approaches.

by Synced 2021-08-23 3

AI Computer Vision & Graphics Machine Learning & Data Science Research

UCSD & Microsoft Improve Image Recognition With Extremely Low FLOPs

A research team from University of California San Diego and Microsoft proposes Micro-Factorized Convolution (MF-Conv), a novel approach that can deal with extremely low computational costs (4M–21M FLOPs) and achieves significant performance gains over state of the art models in the low FLOP regime.

by Synced 2021-08-04 2

AI Computer Vision & Graphics Machine Learning & Data Science Research

Google’s Generative Video Compression Technique Outperforms Traditional Neural Video Compression

A team from Google Research presents a neural video compression method based on generative adversarial networks (GANs) that outperforms previous neural video compression methods and advances generative video compression research.

by Synced 2021-07-23 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

Graph Kernel Attention Transformers: Toward Expressive and Scalable Graph Processing

A research team from Google Brain, Columbia University and University of Oxford proposes Graph Kernel Attention Transformers (GKATs), a new class of graph neural network that achieves greater expressive power than SOTA GNNs while reducing computation burdens.

by Synced 2021-07-19 2

AI Computer Vision & Graphics Machine Learning & Data Science Research

‘QuanTaichi’ Quantized Simulation: High Visual Quality With Reduced Memory Cost

A research team from Taichi Graphics, MIT CSAIL, Zhejiang University, Tsinghua University and Kuaishou Technology introduces a programming language and compiler for quantized simulation that achieves both high performance and significantly reduced memory costs by enabling flexible and aggressive quantization.

by Synced 2021-07-08 1

AI Computer Vision & Graphics Machine Learning & Data Science Research

Meet CLIPDraw: Text-to-Drawing Synthesis via Language-Image Encoders Without Model Training

A research team from Cross Compass Ltd, Massachusetts Institute of Technology, Tokyo Institute of Technology and University of Tokyo presents CLIPDraw, an algorithm that synthesizes drawings based on natural language input without the need for any training.

by Synced 2021-07-06 3

AI Computer Vision & Graphics Machine Learning & Data Science Popular Research

Facebook & UC Berkeley Substitute a Convolutional Stem to Dramatically Boost Vision Transformers’ Optimization Stability

A research team from Facebook AI and UC Berkeley finds a solution for vision transformers’ optimization instability problem by simply using a standard, lightweight convolutional stem for ViT models. The approach dramatically increases optimizer stability and improves peak performance without sacrificing computation efficiency.

by Synced 2021-07-01 2

AI Computer Vision & Graphics Machine Learning & Data Science Research

Video Swin Transformer Improves Speed-Accuracy Trade-offs, Achieves SOTA Results on Video Recognition Benchmarks

A research team from Microsoft Research Asia, University of Science and Technology of China, Huazhong University of Science and Technology, and Tsinghua University takes advantage of the inherent spatiotemporal locality of videos to present a pure-transformer backbone architecture for video recognition that leads to a better speed-accuracy trade-off.

by Synced 2021-06-02 2

AI Computer Vision & Graphics Machine Learning & Data Science Research

Google & Rutgers’ Aggregating Nested Transformers Yield Better Accuracy, Data Efficiency and Convergence

A research team from Google Cloud AI, Google Research and Rutgers University simplifies vision transformers’ complex design, proposing nested transformers (NesT) that simply stack basic transformer layers to process non-overlapping image blocks individually. The approach achieves superior ImageNet classification accuracy and improves model training efficiency.

by Synced 2021-05-19 9

AI Computer Vision & Graphics Research

Intelligent Graphic Design: Adobe’s Directional GAN Automates Image Content Generation for Marketing Campaigns

A research team from Adobe proposes Directional GAN (DGAN), a novel and simple approach for generating high-resolution images conditioned on expected semantic attributes, greatly simplifying the image content generating process for marketing campaigns, websites and banners.

by Synced 2021-04-30 2

AI Computer Vision & Graphics Machine Learning & Data Science Research

Yann LeCun Team’s Novel End-to-End Modulated Detector Captures Visual Concepts in Free-Form Text

A research team from NYU and Facebook proposes MDETR, an end-to-end modulated detector that identifies objects in images conditioned on a raw text query and is able to capture a long tail of visual concepts expressed in free-form text.

by Synced 2021-04-01 2

AI Computer Vision & Graphics Research

Google Research’s SOTA GNN ‘Reasons’ Interactions over Time to Boost Video Understanding

A research team from Google Research propose a message-passing graph neural network that can explicitly model spatio-temporal relations, use either implicitly or explicitly representations of objects, and generalize previous structured models for video understanding.

by Synced 2021-03-31 2

AI Computer Vision & Graphics Research

Google Research’s Novel High Efficient Neural Volumetric Representation Enables Real-Time View Synthesis

A Google Research team accelerates Neural Radiance Fields’ rendering procedure for view-synthesis tasks, enabling it to work in real-time while retaining its ability to represent fine geometric details and convincing view-dependent effects.

by Synced 2021-03-15 3

AI Computer Vision & Graphics Research

Yann LeCun Team’s Barlow Twins Method Boosts SSL in Image Representation via Redundancy Reduction

Yann LeCun and a team of researchers propose Barlow Twins, a method that learns self-supervised representations through a joint embedding of distorted images, with an objective function that can make the embedding vectors almost identical while reducing redundancy between their components.

by Synced 2021-03-12 2

AI Computer Vision & Graphics Machine Learning & Data Science Research

Oxford Novel Image Compression Method COIN: Better Than JPEG at Low Bitrates!

University of Oxford researchers propose COIN, a novel image compression method that stores the weights of an MLP overfitted to an image and outperforms JPEG at low bitrates even without entropy coding.

by Synced 2021-03-08 1

AI Computer Vision & Graphics Research

Meet Transformer in Transformer: A Visual Transformer That Captures Structural Information From Images

A team from Huawei, ISCAS and UCAS propose the novel Transformer-iN-Transformer (TNT) for modelling both patch-level and pixel-level representations.

by Synced 2021-03-02 28

AI Computer Vision & Graphics Research Share My Research

Fast Video Object Segmentation using the Global Context Module

A novel module that effectively and efficiently propagates information through an arbitrarily long video, with constant complexity w.r.t. number of frames and linear complexity w.r.t. resolution.

by Synced 2021-02-17 3

AI Computer Vision & Graphics Research

UC Berkeley & Google’s BoTNet Applies Self-Attention to CV Bottlenecks

Researchers from UC Berkeley and Google Research have introduced BoTNet, a “conceptually simple yet powerful” backbone architecture that boosts performance on computer vision (CV) tasks such as image classification, object detection and instance segmentation.