Latest Posts

AI Machine Learning & Data Science Research

Microsoft’s Neural Codec Language Models Synthesize High-Quality Personalized Speech From a 3-Second Sample

In the new paper Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers, a Microsoft research team presents VALL-E, the first language model-based text-to-speech (TTS) system with strong in-context learning. VALL-E achieves state-of-the-art personalized speech synthesis quality via prompting in a zero-shot setting.
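
The paper’s framing is easy to picture in code: TTS becomes next-token prediction over the discrete tokens of a neural audio codec, with the phonemized text and the 3-second enrolled recording serving simply as the prompt. Below is a hedged toy sketch of that loop; toy_codec_lm, the vocabulary size, and the prompt length are illustrative stand-ins, not the paper’s actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
CODEC_VOCAB = 1024  # one quantizer level of a neural codec (toy size)

def toy_codec_lm(context):
    """Stand-in for the trained decoder-only transformer: next-token
    logits over the codec vocabulary given the token context."""
    return rng.normal(size=CODEC_VOCAB)

def synthesize(phonemes, prompt_codes, n_steps=100):
    # VALL-E's framing: TTS as conditional language modelling over discrete
    # codec tokens; the text and the 3-second acoustic prompt form the prefix.
    context = np.concatenate([phonemes, prompt_codes])
    out = []
    for _ in range(n_steps):
        logits = toy_codec_lm(context)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        token = rng.choice(CODEC_VOCAB, p=probs)  # sample the next codec token
        context = np.append(context, token)
        out.append(token)
    return np.array(out)  # a real system decodes these back to a waveform

codes = synthesize(np.arange(10), rng.integers(0, CODEC_VOCAB, size=75))
```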

AI Machine Learning & Data Science Research

Baidu Create 2022 Forum Details Strategy for Next-Level AI-Enhanced Creativity via Feedback-Driven Innovation

Baidu, Inc. today hosted its annual flagship developer conference, Baidu Create 2022. At the event, Baidu offered an in-depth exploration of its research and analysis of future technology trends, covering a range of emerging technologies including artificial intelligence, autonomous driving, intelligent search, quantum computing and AI scientific computing.

AI Machine Learning & Data Science Research

Google’s Masked Generative Transformers Achieve SOTA Text-To-Image Performance With Improved Efficiency

In the new paper Muse: Text-To-Image Generation via Masked Generative Transformers, a Google Research team introduces Muse, a transformer-based text-to-image synthesis model that leverages masked image modelling to achieve state-of-the-art performance while being significantly faster than diffusion or autoregressive models.
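
Muse generates by iteratively filling in masked image tokens in parallel, rather than denoising over many steps or decoding one token at a time, which is where the efficiency gain comes from. The toy sketch below illustrates that decoding loop under stated assumptions: the predictor here is random (Muse’s is a text-conditioned transformer), and the schedule details are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
V, N = 8192, 256  # token vocabulary and image-token grid size (16x16), toy

def toy_predictor(tokens, mask):
    """Stand-in for the masked transformer: per-position probabilities over
    the vocabulary (random here; in Muse, conditioned on the text prompt)."""
    p = rng.random((N, V))
    return p / p.sum(axis=-1, keepdims=True)

def iterative_decode(steps=8):
    tokens = np.zeros(N, dtype=int)
    mask = np.ones(N, dtype=bool)  # every image token starts masked
    for t in range(steps):
        probs = toy_predictor(tokens, mask)
        pred, conf = probs.argmax(-1), probs.max(-1)
        conf[~mask] = np.inf  # already-decoded tokens are never re-masked here
        keep_masked = int(N * np.cos(np.pi / 2 * (t + 1) / steps))  # schedule
        cutoff = np.sort(conf)[keep_masked] if keep_masked > 0 else -np.inf
        newly = mask & (conf >= cutoff)  # commit the most confident tokens
        tokens[newly], mask[newly] = pred[newly], False
    return tokens

image_tokens = iterative_decode()  # map through a VQ decoder to get pixels
```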

AI Machine Learning & Data Science Research

Stanford & Buffalo U Advance Language Modelling with State Space Models

In the new paper Hungry Hungry Hippos: Towards Language Modeling with State Space Models, Stanford University and State University of New York at Buffalo researchers explore the expressivity gap between state space models (SSMs) and the attention mechanisms of transformer language models, and propose FlashConv to improve SSM training efficiency on modern hardware.
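
The efficiency angle is concrete: applying an SSM to a length-L sequence amounts to one long convolution, which the FFT reduces from O(L²) to O(L log L), and FlashConv is about reorganizing exactly this computation for GPU memory hierarchies. Here is a minimal numpy sketch of the underlying pattern; the kernel K is a toy stand-in for a real SSM’s kernel.

```python
import numpy as np

def ssm_conv(u, K):
    """Apply an SSM to input u via its convolution kernel K. Direct
    time-domain evaluation is O(L^2); this FFT route is O(L log L)."""
    L = len(u)
    n = 2 * L  # zero-pad to avoid circular wrap-around
    return np.fft.irfft(np.fft.rfft(u, n) * np.fft.rfft(K, n), n)[:L]

L = 4096
u = np.random.randn(L)        # input sequence
K = 0.9 ** np.arange(L)       # kernel of a toy 1-D SSM (decaying memory)
y = ssm_conv(u, K)

# check against the naive causal convolution on a short prefix
naive = np.array([sum(K[j] * u[t - j] for j in range(t + 1)) for t in range(32)])
assert np.allclose(y[:32], naive)
```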

AI Machine Learning & Data Science Research

DeepMind & Google’s ML-Based GraphCast Outperforms the World’s Best Medium-Range Weather Forecasting System

In the new paper GraphCast: Learning Skillful Medium-Range Global Weather Forecasting, a research team from DeepMind and Google presents GraphCast, a machine-learning (ML)-based weather simulator that scales well with data and can generate a 10-day forecast in under 60 seconds. GraphCast outperforms the world’s most accurate deterministic operational medium-range weather forecasting system and surpasses existing ML-based baselines.
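
The speed comes from an autoregressive rollout: the model advances the global state in six-hour increments and feeds each output back in as the next input, so a 10-day forecast is 40 cheap forward passes. A hedged toy sketch of that loop; toy_step and the grid sizes are placeholders, not GraphCast itself.

```python
import numpy as np

def toy_step(state):
    """Stand-in for the learned graph-network simulator: maps the current
    global state to the state six hours later."""
    return state + 0.01 * np.tanh(state)  # placeholder dynamics

def forecast(initial_state, days=10, hours_per_step=6):
    # Autoregressive rollout: each prediction becomes the next input.
    state, trajectory = initial_state, []
    for _ in range(days * 24 // hours_per_step):  # 40 steps for 10 days
        state = toy_step(state)
        trajectory.append(state)
    return np.stack(trajectory)

# toy grid: 1° lat-lon with 5 variables (GraphCast uses 0.25° and far more)
trajectory = forecast(np.random.randn(181 * 360, 5))
```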

AI Computer Vision & Graphics Machine Learning & Data Science Research

OpenAI’s Point·E: Generating 3D Point Clouds From Complex Prompts in Minutes on a Single GPU

In the new paper Point-E: A System for Generating 3D Point Clouds from Complex Prompts, an OpenAI research team presents Point·E, a system for text-conditional synthesis of 3D point clouds that leverages diffusion models to generate diverse and complex 3D shapes from complex text prompts in minutes on a single GPU.

AI Machine Learning & Data Science Natural Language Tech Research

Microsoft’s Structured Prompting Breaks In-Context Learning Length Limits, Scales to Thousands of Examples

In the new paper Structured Prompting: Scaling In-Context Learning to 1,000 Examples, a Microsoft Research team proposes structured prompting. The novel approach breaks through conventional in-context learning length limits, scaling to thousands of examples with reduced computational complexity and superior performance and stability.
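
One simplified reading of the mechanism: demonstration examples are split into groups that are encoded independently (so encoding cost grows linearly with the number of examples), and the test input then attends over all the cached states at once, with demonstration logits rescaled so the groups’ combined mass cannot drown out the input’s own context. The numpy sketch below is a hedged illustration of that final attention step; the shapes and the exact rescaling are illustrative, not the paper’s precise formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d, M = 16, 20
groups = [rng.normal(size=(50, d)) for _ in range(M)]  # 20 groups of cached
# demonstration states, each encoded independently (1,000 examples in total)
own_keys = rng.normal(size=(8, d))  # the test input's own token states
q = rng.normal(size=d)              # a query vector from the test input

keys = np.concatenate(groups + [own_keys])
logits = keys @ q / np.sqrt(d)
logits[: -len(own_keys)] -= np.log(M)  # rescale: divide each demonstration
# token's attention weight by the number of groups M
attn = softmax(logits)
output = attn @ keys  # keys double as values in this toy
```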

AI Computer Vision & Graphics Machine Learning & Data Science Research

Maryland U & NYU’s Visual Exploration Reveals What Vision Transformers Learn

In the new paper What Do Vision Transformers Learn? A Visual Exploration, a research team from the University of Maryland and New York University uses large-scale feature visualizations from a wide range of vision transformers to gain insights into what they learn from images and how they differ from convolutional neural networks.

AI Machine Learning & Data Science Research

Microsoft’s E5 Text Embedding Model Tops the MTEB Benchmark With 40x Fewer Parameters

In the new paper Text Embeddings by Weakly-Supervised Contrastive Pre-training, a Microsoft research team introduces Embeddings from Bidirectional Encoder Representations (E5), a general-purpose text embedding model for tasks requiring a single-vector representation of texts and the first model to surpass the BM25 baseline on the BEIR retrieval benchmark under a zero-shot setting.
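
The “weakly-supervised contrastive pre-training” of the title follows the standard InfoNCE recipe over mined text pairs: each query embedding should score highest against its own paired passage, with the rest of the batch acting as negatives. A minimal numpy sketch of that loss; the temperature and shapes are illustrative.

```python
import numpy as np

def contrastive_loss(q, p, temperature=0.05):
    """InfoNCE with in-batch negatives: row i of q (query embeddings) should
    match row i of p (passage embeddings) and nothing else in the batch."""
    q = q / np.linalg.norm(q, axis=1, keepdims=True)  # cosine similarity
    p = p / np.linalg.norm(p, axis=1, keepdims=True)
    sim = q @ p.T / temperature                       # (batch, batch) logits
    m = sim.max(axis=1, keepdims=True)                # stable log-softmax
    log_probs = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()

rng = np.random.default_rng(0)
queries, passages = rng.normal(size=(32, 768)), rng.normal(size=(32, 768))
print(contrastive_loss(queries, passages))  # drives paired rows together
```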

AI Machine Learning & Data Science Research

Google & Lund U’s Optimus Learned Optimization Architecture Efficiently Captures Complex Dependencies

In the new paper Transformer-Based Learned Optimization, a Google Research and Lund University team presents Optimus, an expressive neural network architecture for learned optimization that captures complex dependencies in the parameter space and achieves competitive results on real-world tasks and benchmark optimization problems.

AI Machine Learning & Data Science Natural Language Tech Research

DeepMind & UCL Fine-tune a 70B Parameter LM to Generate Statements Agreeable to Humans with Diverse Opinions

In the new paper Fine-tuning Language Models To Find Agreement Among Humans With Diverse Preferences, a research team from DeepMind and University College London fine-tunes a 70 billion parameter language model to generate statements that maximize agreement among a human group with diverse written opinions.

AI Machine Learning & Data Science Research

Alibaba’s VQRF Realizes a 100x Compression Rate, Reducing Volumetric Radiance Files to 1 MB

In the new paper Compressing Volumetric Radiance Fields to 1 MB, an Alibaba Group research team proposes vector quantized radiance fields (VQRF), a simple yet efficient framework for compressing volumetric radiance fields that achieves up to 100x storage reduction, shrinking the original grid model to around 1 MB with negligible loss in rendering quality.
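
The compression recipe is easy to sketch: cluster the voxel features into a small codebook and store one byte-sized code index per voxel instead of a full feature vector. The toy numpy version below already shows how the arithmetic delivers large ratios; VQRF itself adds further machinery such as voxel pruning and joint fine-tuning to reach 100x with little quality loss.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = rng.normal(size=(32, 32, 32, 12)).astype(np.float32)  # toy voxel
feats = grid.reshape(-1, 12)                                 # feature grid

codebook = feats[rng.choice(len(feats), 256, replace=False)]  # 256 codes
for _ in range(10):  # k-means-style codebook refinement
    d2 = ((feats ** 2).sum(1)[:, None] + (codebook ** 2).sum(1)[None]
          - 2 * feats @ codebook.T)
    assign = d2.argmin(1)  # nearest code for every voxel feature
    for c in range(256):
        members = feats[assign == c]
        if len(members):
            codebook[c] = members.mean(0)

indices = assign.astype(np.uint8)  # 1 byte per voxel instead of 48
original = feats.nbytes
compressed = indices.nbytes + codebook.nbytes
print(f"compression: {original / compressed:.1f}x")  # ~35x on this toy grid
```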

AI Machine Learning & Data Science Research

Stanford U & Google’s Convex Analytic Training Framework Improves the Understanding and Optimization of Transformers

In the new paper Convexifying Transformers: Improving Optimization and Understanding of Transformer Networks, a Stanford University and Google Research team provides a solid theoretical analysis of transformers’ fundamental mechanisms and introduces a novel convex analytic training framework for improving their optimization.

AI Machine Learning & Data Science Research

DeepMind Studies Process- vs Outcome-based Model Supervision, Significantly Reducing Reasoning Errors on Math Word Problems

In the new paper Solving Math Word Problems With Process- and Outcome-based Feedback, a DeepMind research team conducts the first comprehensive comparison between process- and outcome-based model supervision. The two approaches achieve comparable final-answer error rate improvements on math word problems, while the process-based method significantly reduces reasoning errors from 14.0 to just 3.4 percent.

AI Machine Learning & Data Science Research

No Images Are Needed! Allen AI’s CLOSE Learns to Complete Visual Tasks From Text Inputs Alone

In the new paper I Can’t Believe There’s No Images! Learning Visual Tasks Using only Language Data, an Allen Institute for Artificial Intelligence team proposes Cross Modal Transfer On Semantic Embeddings (CLOSE), an approach that learns high-level skills from textual data, then uses these skills to complete vision tasks without additional visual training data.
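
The trick relies on a contrastive encoder pair (CLIP-style) whose text and image embeddings live in a shared space: train the task head on noisy text embeddings, then feed it image embeddings at test time. Below is a hedged toy sketch with stand-in encoders and a linear head; every name and size here is illustrative, not the paper’s implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
D, C = 512, 10  # shared embedding width (CLIP-like) and number of classes

def text_embed(caption):   # stand-in for a frozen contrastive text encoder
    return rng.normal(size=D)

def image_embed(image):    # stand-in for the paired frozen image encoder
    return rng.normal(size=D)

def train_step(W, caption, label, lr=0.1, noise=0.1):
    # CLOSE's key move: learn the task from *text* embeddings alone, adding
    # noise so the head tolerates the text/image gap in the shared space.
    e = text_embed(caption) + noise * rng.normal(size=D)
    logits = W @ e
    p = np.exp(logits - logits.max())
    p /= p.sum()
    p[label] -= 1.0
    W -= lr * np.outer(p, e)  # in-place softmax-regression gradient step

def predict(W, image):
    # test time: swap in the image embedding; no images were seen in training
    return int((W @ image_embed(image)).argmax())

W = np.zeros((C, D))
train_step(W, caption="a photo of a dog", label=3)
print(predict(W, image=None))  # toy call; a real image would go here
```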

AI Machine Learning & Data Science Research

NeurIPS 2022 | MIT & Meta Enable Gradient Descent Optimizers to Automatically Tune Their Own Hyperparameters

In the NeurIPS 2022 Outstanding Paper Gradient Descent: The Ultimate Optimizer, MIT CSAIL and Meta researchers present a novel technique that enables gradient descent optimizers such as SGD and Adam to tune their hyperparameters automatically. The method requires no manual differentiation and can be stacked recursively to many levels.
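
In its simplest one-level form, the core trick is hypergradient descent: differentiate the loss through the previous update step to obtain a gradient with respect to the learning rate itself. The paper derives such gradients via automatic differentiation and stacks the construction recursively; the hand-derived toy below shows only the basic update.

```python
def loss_grad(w):
    return 2.0 * (w - 3.0)  # gradient of the toy loss f(w) = (w - 3)^2

w, lr, hyper_lr = 0.0, 0.01, 1e-4
prev_g = 0.0
for _ in range(500):
    g = loss_grad(w)
    # d(loss)/d(lr) = -g_t * g_{t-1}, since w_t = w_{t-1} - lr * g_{t-1};
    # stepping lr against that derivative tunes it on the fly
    lr += hyper_lr * g * prev_g
    w -= lr * g
    prev_g = g
print(w, lr)  # w approaches 3.0 while lr adapts automatically
```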

AI Computer Vision & Graphics Machine Learning & Data Science Research

Moody Moving Faces: NVIDIA’s SPACEx Delivers High-Quality Portrait Animation with Controllable Expression

In the new paper SPACEx: Speech-driven Portrait Animation with Controllable Expression, an NVIDIA research team introduces SPACEx — a speech-driven portrait animation framework that generates high-resolution and expressive facial videos with control over subject pose, emotion and expression intensity.

AI Machine Learning & Data Science Research

‘MrsFormer’ Employs a Novel Multiresolution-Head Attention Mechanism to Cut Transformers’ Compute and Memory Costs

In the new paper Transformers with Multiresolution Attention Heads (currently under double-blind review for ICLR 2023), researchers propose MrsFormer, a novel transformer architecture that uses multiresolution-head attention to approximate the outputs of attention heads at multiple scales, significantly reducing head redundancy without sacrificing accuracy.
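
One plausible way to picture multiresolution heads, as a hedged toy reading rather than the paper’s exact construction: let some heads attend over average-pooled keys and values, so the coarse heads cost a fraction of full attention.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

def pooled(X, s):
    """Average-pool a (L, d) sequence by factor s: a coarser 'resolution'."""
    L, d = X.shape
    return X[: L - L % s].reshape(-1, s, d).mean(axis=1)

rng = np.random.default_rng(0)
L, d = 128, 32
Q, K, V = (rng.normal(size=(L, d)) for _ in range(3))

# One head per resolution: the full-resolution head costs O(L^2), while the
# coarser heads attend over pooled keys/values at a fraction of that cost.
head_full = attention(Q, K, V)
head_mid = attention(Q, pooled(K, 2), pooled(V, 2))
head_coarse = attention(Q, pooled(K, 4), pooled(V, 4))
output = np.concatenate([head_full, head_mid, head_coarse], axis=-1)
```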

AI Machine Learning & Data Science Research

UT Austin & Sony AI’s VIOLA Object-Centric Imitation Learning Method for Robot Manipulation Outperforms the SOTA by 45.8%

In the new paper VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors, researchers from the University of Texas at Austin and Sony AI present VIOLA (Visuomotor Imitation via Object-centric LeArning), an object-centric imitation learning model that endows imitation learning with awareness of objects and their interactions.

AI Machine Learning & Data Science Research

Almost 7X Cheaper! Colossal-AI’s Open-Source Solution Accelerates AIGC With Low-Cost Diffusion Pretraining and Hardware-Efficient Fine-Tuning

Colossal-AI releases a complete open-source Stable Diffusion pretraining and fine-tuning solution that reduces the pretraining cost by 6.5 times and the hardware cost of fine-tuning by 7 times, while simultaneously speeding up both processes! The fine-tuning workflow can also be conveniently completed on a PC with an RTX 2070/3050 GPU.

AI Machine Learning & Data Science Natural Language Tech Popular Research

MIT, Northeastern & Technion Propose ROME for Efficient Locating and Editing of Factual Associations in GPT Models

In the new paper Locating and Editing Factual Associations in GPT, a research team from MIT CSAIL, Northeastern University and the Technion (Israel Institute of Technology) examines how information flows during knowledge recall in large autoregressive transformers and introduces Rank-One Model Editing (ROME), a simple and principled zero-shot model editor capable of locating and editing factual associations in such models.
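
The “rank-one” part has a clean closed form: given a key vector k* that the chosen MLP layer produces when recalling a fact, pick the smallest rank-one weight update that maps k* to a new value v*. ROME’s actual update weights this by an estimate of the key covariance; the unweighted toy version below shows the idea.

```python
import numpy as np

def rank_one_edit(W, k_star, v_star):
    """Smallest (Frobenius-norm) rank-one update to W such that the edited
    layer maps the key k* for a fact to a chosen value v*. ROME applies a
    covariance-weighted version of this to one MLP layer in GPT."""
    residual = v_star - W @ k_star
    return W + np.outer(residual, k_star) / (k_star @ k_star)

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 48))        # a mid-layer MLP projection (toy sizes)
k = rng.normal(size=48)              # key recalling, e.g., "the Eiffel Tower"
v = rng.normal(size=64)              # value encoding the edited fact
W_edited = rank_one_edit(W, k, v)
assert np.allclose(W_edited @ k, v)  # the new association is stored
```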

AI Machine Learning & Data Science Research

Baidu’s Parallel Evoformer and Branch Parallelism Strategy Accelerates AlphaFold2 Training by 38.67%

In the new paper Efficient AlphaFold2 Training using Parallel Evoformer and Branch Parallelism, a Baidu research team presents a Parallel Evoformer and Branch Parallelism approach for efficient AlphaFold2 training. The novel strategy improves AlphaFold2 training speed by up to 38.67 percent without sacrificing performance.

AI Machine Learning & Data Science Research

Befuddling AI Go Systems: MIT, UC Berkeley & FAR AI’s Adversarial Policy Achieves a >99% Win Rate Against KataGo

In the new paper Adversarial Policies Beat Professional-Level Go AIs, a research team from MIT, UC Berkeley, and FAR AI employs a novel adversarial policy to attack the state-of-the-art AI Go system KataGo. The team believes theirs is the first successful end-to-end attack against an AI Go system playing at the level of a human professional.

AI Machine Learning & Data Science Research

Meta AI & Columbia U ‘Squeeze the Juice’ to Turn Bad Responses into Good Labels and Boost Dialogue Model Performance

In the new paper When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels, a research team from Meta AI and Columbia University proposes JUICER, a framework that effectively utilizes binary and textual human feedback to improve the conversational responses of dialogue models.