In a paper currently under double-blind review for ICLR 2022, researchers propose StyleNeRF, a 3D-aware generative model that can synthesize high-resolution images at interactive rates while preserving high-quality 3D consistency, and can even generalize to unseen views with control on styles and poses.
A research team from the University of Southern California and Google proposes TOME, a “mention memory” approach to factual knowledge extraction for NLU tasks. A transformer model with attention over a semi-parametric representation of the entire Wikipedia text corpus, TOME can extract information without supervision and achieves strong performance on multiple open-domain question answering benchmarks.
A Google Research team conducts a systematic exploration comprising more than 4800 experiments on Vision Transformers, MLP-Mixers and ResNets with parameters ranging from 10 million to 10 billion, evaluated on more than 20 downstream image recognition tasks, aiming to capture the nonlinear relationships between performance on upstream and downstream tasks.
A NVIDIA and Aalto University research team presents StyleGAN3, a novel generative adversarial network (GAN) architecture where the exact sub-pixel position of each feature is exclusively inherited from the underlying coarse features, enabling a more natural transformation hierarchy and advancing GAN-based animation generation.
A research team proposes ConvMixer, an extremely simple model designed to support the argument that the impressive performance of vision transformers (ViTs) is mainly attributable to their use of patches as the input representation. The study shows that ConvMixer can outperform ViTs, MLP-Mixers and classical vision models.
An Apple research team performs a comparative analysis on a contrastive self-supervised learning (SSL) algorithm (SimCLR) and a supervised learning (SL) approach for simple image data in a common architecture, shedding light on the similarities and dissimilarities in their learned visual representation patterns.
In the paper Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers, a research team from New York University and the University of North Carolina at Chapel Hill uses centered kernel alignment (CKA) to measure the similarity of representations across layers and explore how fine-tuning changes transformers’ learned representations.
In the paper Introducing Symmetries to Black Box Meta Reinforcement Learning, a research team from DeepMind and The Swiss AI Lab IDSIA explores the role of symmetries in meta generalization and shows that introducing more symmetries to black-box meta-learners can improve their ability to generalize to unseen action and observation spaces, tasks, and environments.
A Google AI research team explores zero-label learning (training with synthetic data only) in natural language processing, and introduces Unsupervised Data Generation (UDG), a training data creation procedure designed to synthesize high-quality training data without human annotations.
A research team from Carnegie Mellon University, Google Brain and UC Berkeley proposes a robust predictable control (RPC) method for learning reinforcement learning policies that use fewer bits of information. This simple and theoretically-justified algorithm achieves much tighter compression, is more robust, and generalizes better than prior methods, achieving up to 5× higher rewards than a standard information bottleneck.
MIT researchers present an automated, objective and transparent data-driven method for measuring media bias. The study analyses roughly a million articles from about a hundred newspapers for bias on various news topics, maps the newspapers into a two-dimensional media bias landscape, and shows that the data-driven results agree well with human-judgement classifications.
In the paper ReGen: Reinforcement Learning for Text and Knowledge Base Generation Using Pretrained Language Models, IBM researchers present ReGen, a bidirectional generation of text and graph that leverages reinforcement learning to push the performance of text-to-graph and graph-to-text generation tasks to a higher level.
A research team from Stanford University introduces BEHAVIOR, a benchmark for embodied AI with 100 realistic, diverse and complex everyday household activities in simulation. BEHAVIOR addresses challenges such as definition, instantiation in a simulator, and evaluation; and pushes the state-of-the-art by adding new types of state changes.
A Nvidia research team presents Isaac Gym — a high-performance robotics simulation platform that runs an end-to-end GPU accelerated training pipeline. Compared to conventional RL training methods that use a CPU-based simulator and GPU for neural networks, Isaac Gym achieves training speedups of 2-3 orders of magnitude on continuous control tasks.