Author: Synced

Machine Intelligence | Technology & Industry | Information & Analysis
AI Machine Learning & Data Science Research

NeurIPS 2022 | MIT & Meta Enable Gradient Descent Optimizers to Automatically Tune Their Own Hyperparameters

In the NeurIPS 2022 Outstanding Paper Gradient Descent: The Ultimate Optimizer, MIT CSAIL and Meta researchers present a novel technique that enables gradient descent optimizers such as SGD and Adam to tune their hyperparameters automatically. The method requires no manual differentiation and can be stacked recursively to many levels.
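As a rough illustration of the idea behind this teaser (not the paper's implementation), the sketch below lets plain SGD adjust its own step size by gradient descent on a toy quadratic. The paper's method obtains the hyperparameter gradient through automatic differentiation; here the derivative is written out by hand to keep the example dependency-free. The function name, the meta step size `kappa`, and the toy loss are all assumptions made for this example.

```python
def sgd_with_learned_step_size(grad_fn, w, alpha=0.01, kappa=0.001, steps=100):
    """Toy SGD that also runs gradient descent on its own step size alpha.

    Hypothetical helper for illustration only. Since the previous update was
    w = w_prev - alpha * prev_grad, the chain rule gives
    d(loss(w)) / d(alpha) = -grad(w) * prev_grad.
    """
    prev_grad = 0.0
    for _ in range(steps):
        g = grad_fn(w)
        # Descend on alpha: alpha -= kappa * (-g * prev_grad)
        alpha = alpha + kappa * g * prev_grad
        # Ordinary SGD step on the weight, using the updated step size
        w = w - alpha * g
        prev_grad = g
    return w, alpha


# Toy loss f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w_final, alpha_final = sgd_with_learned_step_size(lambda w: 2.0 * (w - 3.0), w=0.0)
print(w_final, alpha_final)
```

In this toy run the step size grows from its deliberately small initial value while the weight moves toward the minimizer at 3, which is the behavior the teaser describes: the optimizer corrects a poorly chosen hyperparameter on its own.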

AI Computer Vision & Graphics Machine Learning & Data Science Research

Moody Moving Faces: NVIDIA’s SPACEx Delivers High-Quality Portrait Animation with Controllable Expression

In the new paper SPACEx: Speech-driven Portrait Animation with Controllable Expression, an NVIDIA research team introduces SPACEx — a speech-driven portrait animation framework that generates high-resolution and expressive facial videos with control over subject pose, emotion and expression intensity.

AI Machine Learning & Data Science Research

‘MrsFormer’ Employs a Novel Multiresolution-Head Attention Mechanism to Cut Transformers’ Compute and Memory Costs

In the new paper Transformers with Multiresolution Attention Heads (currently under double-blind review for ICLR 2023), researchers propose MrsFormer, a novel transformer architecture that uses Multiresolution-head Attention to approximate output sequences and significantly reduces head redundancy without sacrificing accuracy.

AI Machine Learning & Data Science Research

UT Austin & Sony AI’s VIOLA Object-Centric Imitation Learning Method for Robot Manipulation Outperforms the SOTA by 45.8%

In the new paper VIOLA: Imitation Learning for Vision-Based Manipulation with Object Proposal Priors, researchers from the University of Texas at Austin and Sony AI present VIOLA (Visuomotor Imitation via Object-centric LeArning), an object-centric imitation learning model that makes the learned policy aware of objects and their interactions.

AI Machine Learning & Data Science Research

Diffusion Pretraining and Hardware Fine-Tuning Can Be Almost 7X Cheaper! Colossal-AI’s Open Source Solution Accelerates AIGC at a Low Cost

Colossal-AI has released a complete open-source Stable Diffusion pretraining and fine-tuning solution that reduces pretraining cost by 6.5 times and the hardware cost of fine-tuning by 7 times, while also speeding up both processes. The fine-tuning workflow can be run on a consumer PC with an RTX 2070/3050 GPU.

AI Machine Learning & Data Science Nature Language Tech Popular Research

MIT, Northeastern & Technion Propose ROME for Efficient Locating and Editing of Factual Associations in GPT Models

In the new paper Locating and Editing Factual Associations in GPT, a research team from MIT CSAIL, Northeastern University and Technion IIT examines how information flows during knowledge recall in large autoregressive transformers and introduces Rank-One Model Editing (ROME), a simple, zero-shot principled model editor capable of locating and editing factual associations in such models.

AI Machine Learning & Data Science Research

Baidu’s Parallel Evoformer and Branch Parallelism Strategy Accelerates AlphaFold2 Training by 38.67%

In the new paper Efficient AlphaFold2 Training using Parallel Evoformer and Branch Parallelism, a Baidu research team presents a Parallel Evoformer and Branch Parallelism approach for efficient AlphaFold2 training. The novel strategy improves AlphaFold2 training speed by up to 38.67 percent without sacrificing performance.

AI Machine Learning & Data Science Research

Befuddling AI Go Systems: MIT, UC Berkeley & FAR AI’s Adversarial Policy Achieves a >99% Win Rate Against KataGo

In the new paper Adversarial Policies Beat Professional-Level Go AIs, a research team from MIT, UC Berkeley, and FAR AI employs a novel adversarial policy to attack the state-of-the-art AI Go system KataGo. The team believes theirs is the first successful end-to-end attack against an AI Go system playing at the level of a human professional.

AI Machine Learning & Data Science Research

Meta AI & Columbia U ‘Squeeze the Juice’ to Turn Bad Responses into Good Labels and Boost Dialogue Model Performance

In the new paper When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels, a research team from Meta AI and Columbia University proposes JUICER, a framework that effectively utilizes binary and textual human feedback to improve the conversational responses of dialogue models.

AI Machine Learning & Data Science Research

Google Introduces RankT5: A Fine-Tuned T5 Model That Boosts Text Ranking and Zero-Shot Performance

In the new paper RankT5: Fine-Tuning T5 for Text Ranking with Ranking Losses, a Google Research team presents RankT5, which fine-tunes pretrained T5 models for text ranking with various ranking losses to directly optimize ranking performance. RankT5 models support text ranking more natively by outputting real-valued scores rather than text tokens.

AI Machine Learning & Data Science Research

CMU Takes a Big Step Toward Real-Time Realistic Video Generation Based on Language Descriptions

In the new paper Towards Real-Time Text2Video via CLIP-Guided, Pixel-Level Optimization, researchers from Carnegie Mellon University leverage CLIP-guided, pixel-level optimization to generate 720p videos from natural language descriptions at one to two frames per second, a big step towards a real-time text-to-video system.

AI Machine Learning & Data Science Research

DeepMind Study Shows That Language Models Can Learn From Explanations in Context Even Without Tuning

In the new paper Can Language Models Learn From Explanations in Context?, DeepMind researchers investigate how different types of explanations, instructions, and controls affect language models’ zero- and few-shot performance and how such explanations can support in-context learning for large language models on challenging tasks.

AI Machine Learning & Data Science Research

Google & Stanford Team Applies Chain-of-Thought Prompting to Surpass Human Performance on Challenging BIG-Bench Tasks

In the new paper Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them, a Google Research and Stanford University team applies chain-of-thought (CoT) prompting — a series of intermediate reasoning steps — to 23 BIG-Bench tasks on which language models have failed to outperform the average human rater. The proposed approach enables models to surpass human performance on 17 of the 23 tasks.

AI Machine Learning & Data Science Research

Wider, Not Deeper: Cambridge, Oxford & ICL Challenge Conventional Transformer Design Approaches

In the new paper Wide Attention Is The Way Forward For Transformers, a research team from the University of Cambridge, Imperial College London, and the University of Oxford challenges the commonly held belief that deeper is better for transformer architectures, demonstrating that wider layers result in superior performance on natural language processing tasks.

AI Machine Learning & Data Science Research

Embedding Training With 1% GPU Memory and 100 Times Less Budget, an Open Source Solution for Super-Large Recommendation Model Training on a Single GPU

Colossal-AI has used a heterogeneous training strategy to increase the parameter capacity of NLP model training by hundreds of times on the same hardware. Experimental results show that it needs to keep only 1 to 5 percent of the embedding parameters on the GPU while still maintaining excellent end-to-end training speed.

AI Machine Learning & Data Science Research

Stanford U & Google Brain’s Classifier-Free Guidance Model Diffusion Technique Reduces Sampling Steps by 256x

In the new paper On Distillation of Guided Diffusion Models, researchers from Google Brain and Stanford University propose a novel approach for distilling classifier-free guided diffusion models with high sampling efficiency. The resulting models achieve performance comparable to the original model but with sampling steps reduced by up to 256 times.

AI Machine Learning & Data Science Nature Language Tech Research

‘Ask Me Anything’: Stanford U, Numbers Station & UW Madison’s Novel Prompting Strategy Enables LLMs With 30x Fewer Parameters to Outperform Few-Shot GPT3-175B

In the new paper Ask Me Anything: A Simple Strategy for Prompting Language Models, a research team from Stanford University, Numbers Station, and the University of Wisconsin-Madison presents Ask Me Anything Prompting (AMA), a simple large language model prompting strategy that enables a 30x smaller language model to outperform few-shot GPT3-175B.

AI Computer Vision & Graphics Machine Learning & Data Science Research

Maximizing FLOPS Utilization: DeepMind & NYU Propose Efficiency Evaluations for Visual Pretraining Methods

In the new paper Where Should I Spend My FLOPS? Efficiency Evaluations of Visual Pre-training Methods, DeepMind and NYU Center for Neural Systems researchers introduce computational efficiency evaluation approaches designed to aid in the selection of optimal methods, datasets and models for pretraining visual tasks on a fixed FLOP budget.

AI Machine Learning & Data Science Research

UNC Chapel Hill’s Textless Vision-Language Transformer: Comparable Performance to Text-Based Approaches but 28x Faster

In the new paper TVLT: Textless Vision-Language Transformer, researchers from UNC Chapel Hill present the Textless Vision-Language Transformer (TVLT) for vision-and-language representation learning. TVLT uses only raw visual and audio inputs and performs comparably to its text-based counterparts while requiring only one-third of the parameters and running inference 28x faster.

AI Machine Learning & Data Science Research

DeepMind, Oxford U, IDSIA, Mila & Purdue U’s General Neural Algorithmic Learner Matches Task-Specific Expert Performance

In the new paper A Generalist Neural Algorithmic Learner, a research team from DeepMind, University of Oxford, IDSIA, Mila, and Purdue University presents a novel generalist neural algorithmic learner: a single graph neural network (GNN) capable of learning to execute various classical algorithms at single-task expert level.

AI Machine Learning & Data Science Research

Transformers on Edge Devices? Monash U’s Energy-Saving Attention With Linear Complexity Reduces Compute Cost by 73%

In the new paper EcoFormer: Energy-Saving Attention with Linear Complexity, a Monash University research team presents EcoFormer, an attention mechanism with linear complexity that replaces expensive multiply-accumulate operations with simple accumulations and achieves a 73 percent energy footprint reduction on ImageNet.

AI Machine Learning & Data Science Nature Language Tech Research

Google Brain’s Vec2Text Models for Sentence Generation Excel in Universality, Diversity, Fluency & Semantic Structure

In the new paper Vec2text With Round-Trip Translations, Google Brain researchers explore large language models’ capabilities for generating arbitrary natural language text from inputs of fixed-size vectors — a vec2text setting — and propose a simple data augmentation approach based on round-trip translations to improve vec2text model performance.

AI Machine Learning & Data Science Research

DeepMind’s ‘Expert-Aware’ Data Augmentation Technique Enables Data-Efficient Learning from Parametric Experts

The new DeepMind paper Data Augmentation for Efficient Learning from Parametric Experts proposes Augmented Policy Cloning (APC), a simple yet effective data-augmentation approach designed to support data-efficient learning from parametric experts. The method significantly improves data efficiency across various control and reinforcement learning settings.

AI Machine Learning & Data Science Nature Language Tech Research

Peking U & Microsoft’s Knowledge Attribution Method Enables Editing Factual Knowledge in Pretrained Transformers Without Fine-Tuning

In the new paper Knowledge Neurons in Pretrained Transformers, a research team from Peking University and Microsoft Research introduces a knowledge attribution method that identifies the neurons that store factual knowledge in pretrained transformers and leverages these neurons to edit factual knowledge in transformers without any fine-tuning.