Category: Research

Technical review of the newest machine intelligence research.

AI Machine Learning & Data Science Research

Google & Columbia U’s Mnemosyne: Learning to Train Transformers With Transformers

In the new paper Mnemosyne: Learning to Train Transformers with Transformers, a research team from Google and Columbia University presents Mnemosyne Optimizer, a learning-to-learn system for training entire neural network architectures without any task-specific optimizer tuning.

AI Machine Learning & Data Science Research

Genius or Subpar AI Mathematician? New Study Questions ChatGPT’s Mathematical Capabilities

In the new paper Mathematical Capabilities of ChatGPT, an international research team tests ChatGPT’s mathematical capabilities and evaluates its suitability as an assistant to professional mathematicians. The team concludes that despite the glowing reviews in mainstream media, ChatGPT’s mathematical abilities “are significantly below those of an average mathematics graduate student.”

AI Machine Learning & Data Science Nature Language Tech Research

Stanford U’s DetectGPT Takes a Curvature-Based Approach to LLM-Generated Text Detection

In the new paper DetectGPT: Zero-Shot Machine-Generated Text Detection Using Probability Curvature, a Stanford University research team presents DetectGPT, a zero-shot machine-generated text detection algorithm that uses probability curvature to predict whether a candidate passage was generated by a large language model.
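As a rough illustration of the probability-curvature idea, the sketch below compares a passage's log-probability with the average log-probability of lightly perturbed copies of it. The names `perturbation_discrepancy`, `toy_log_prob` and `toy_perturb` are hypothetical, and the toy scorer and perturbation function are assumptions standing in for a real language model and a real rewriting model. It is a minimal sketch of the general idea, not the authors' implementation.

```python
import random
from statistics import mean
from typing import Callable


def perturbation_discrepancy(
    text: str,
    log_prob: Callable[[str], float],
    perturb: Callable[[str], str],
    num_perturbations: int = 20,
) -> float:
    """Log-probability of `text` minus the mean log-probability of its perturbations."""
    perturbed_scores = [log_prob(perturb(text)) for _ in range(num_perturbations)]
    return log_prob(text) - mean(perturbed_scores)


# Toy stand-ins below are assumptions for illustration only.
COMMON_WORDS = {"the", "cat", "sat", "on", "mat"}
_RNG = random.Random(0)  # fixed seed so the example is repeatable


def toy_log_prob(text: str) -> float:
    # Common words cost 0, every other word costs 1 (a crude "log-probability").
    return -float(sum(word not in COMMON_WORDS for word in text.split()))


def toy_perturb(text: str) -> str:
    # Replace one randomly chosen word with a placeholder token.
    words = text.split()
    words[_RNG.randrange(len(words))] = "zzz"
    return " ".join(words)


score = perturbation_discrepancy("the cat sat on the mat", toy_log_prob, toy_perturb)
```

In this toy setup every perturbation lowers the score by exactly one, so `score` comes out at 1.0. With a real model, a clearly positive discrepancy would suggest the passage sits near a local peak of the model's log-probability, which is the signal such a detector would threshold on.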

AI Machine Learning & Data Science Research

AI Jam Session: Google & Sorbonne U’s MusicLM Achieves SOTA Performance on High-Fidelity Music Generation from Text

In the new paper MusicLM: Generating Music From Text, a Google Research and Sorbonne University team presents MusicLM, a model for generating high-fidelity music that can be conditioned on both text and melody. MusicLM surpasses baselines in both its audio quality and adherence to the text descriptions.

AI Machine Learning & Data Science Research

Microsoft & UCLA Introduce ClimaX: A Foundation Model for Climate and Weather Modelling

In the new paper ClimaX: A Foundation Model for Weather and Climate, a team from Microsoft Autonomous Systems and Robotics Research, Microsoft Research AI4Science and the University of California at Los Angeles presents ClimaX, a foundation model for weather and climate that can be efficiently adapted for general-purpose tasks related to the Earth’s atmosphere.

AI Machine Learning & Data Science Research

Stanford U’s Brain-Computer Interface Enables Stroke and ALS Patients to ‘Speak’ 62 Words per Minute

A Stanford University research team presents a brain-computer interface for translating speech-related neural activity into text (speech BCI) in the new paper A High-performance Speech Neuroprosthesis. Theirs is the first speech BCI to record impulse activity from intracortical microelectrode arrays and could benefit people unable to produce clear utterances due to diseases such as stroke and ALS.

AI Machine Learning & Data Science Research

Oxford U’s Deep Double Duelling Q-Learning Translates Trading Signals Into SOTA Trading Strategies

In the new paper Asynchronous Deep Double Duelling Q-Learning for Trading-Signal Execution in Limit Order Book Markets, an Oxford University research team introduces Deep Double Duelling Q-learning with the APEX architecture to train a trading agent to translate predictive signals into optimal limit order trading strategies.
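For readers unfamiliar with the two ingredients in the method's name, the sketch below shows the standard duelling aggregation (a state value plus mean-centred action advantages) and the standard double Q-learning target (the online network picks the next action, the target network evaluates it). All function names and numbers are hypothetical, the network outputs are hard-coded lists, and the APEX distributed replay component is not covered. It is a generic illustration of these techniques, not the paper's trading agent.

```python
from typing import Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> list[float]:
    """Combine a state value and per-action advantages: Q = V + (A - mean(A))."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def double_q_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    discount: float = 0.99,
    done: bool = False,
) -> float:
    """Pick the next action with the online estimates, score it with the target estimates."""
    if done:
        return reward  # no bootstrapping past a terminal state
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + discount * next_q_target[best_action]


# Made-up next-state estimates for three actions (illustrative values only).
q_online = dueling_q_values(1.0, [0.5, -0.5, 0.0])
q_target = dueling_q_values(0.8, [0.1, 0.3, -0.4])
target = double_q_target(1.0, q_online, q_target, discount=0.9)
```

Separating action selection from action evaluation is the usual motivation for the double variant: it avoids taking a maximum over the same noisy estimates that are then used as the target.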

AI Machine Learning & Data Science Research

Forget About Catastrophic Forgetting: Google’s Continual HyperTransformer Enables Efficient Continual Few-Shot Learning

In the new paper Continual Few-Shot Learning Using HyperTransformers, a Google Research team proposes Continual HyperTransformer, which modifies the recently published HyperTransformer few-shot learning method to sequentially update a convolutional neural network’s weights based on the information in a new task without forgetting the knowledge it learned from previous tasks.

AI Machine Learning & Data Science Research

Meet Tracr: DeepMind & ETH Zurich’s Novel Interpretability Tool Compiles Human-Readable Code to Transformers’ Weights

In the new paper Tracr: Compiled Transformers as a Laboratory for Interpretability, a research team from ETH Zurich and DeepMind presents Tracr, a compiler that addresses the absence of ground truth explanations in deep neural network models by “compiling” human readable code to the weights of a transformer model.

AI Machine Learning & Data Science Research

BERT-Style Pretraining on Convnets? Peking U, ByteDance & Oxford U’s Sparse Masked Modelling With Hierarchy Leads the Way

In the new paper Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling, a research team from Peking University, ByteDance, and the University of Oxford presents Sparse Masked Modelling with Hierarchy (SparK), the first BERT-style pretraining approach that can be used on convolutional models without any backbone modifications.

AI Machine Learning & Data Science Research

Microsoft’s Neural Codec Language Models Synthesize High-Quality Personalized Speech From a 3-Second Sample

In the new paper Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers, a Microsoft research team presents VALL-E, the first language model-based text-to-speech (TTS) system with strong in-context learning. VALL-E achieves state-of-the-art personalized speech synthesis quality via prompting in a zero-shot setting.

AI Machine Learning & Data Science Research

Baidu Create 2022 Forum Details Strategy for Next-Level AI-Enhanced Creativity via Feedback-Driven Innovation

Baidu, Inc. today hosted its annual flagship developer conference, Baidu Create 2022. At the event, the company offered an in-depth look at its research and analysis of future technology trends, covering a range of emerging technologies including artificial intelligence, autonomous driving, intelligent search, quantum computing and AI scientific computing.

AI Machine Learning & Data Science Research

Google’s Masked Generative Transformers Achieve SOTA Text-To-Image Performance With Improved Efficiency

In the new paper Muse: Text-To-Image Generation via Masked Generative Transformers, a Google Research team introduces Muse, a transformer-based text-to-image synthesis model that leverages masked image modelling to achieve state-of-the-art performance while being significantly faster than diffusion or autoregressive models.

AI Machine Learning & Data Science Research

Stanford & Buffalo U Advance Language Modelling with State Space Models

In the new paper Hungry Hungry Hippos: Towards Language Modeling with State Space Models, Stanford University and State University of New York at Buffalo researchers explore the expressivity gap between state space models and transformer language model attention mechanisms and propose FlashConv to improve state space model training efficiency on modern hardware.

AI Machine Learning & Data Science Research

DeepMind & Google’s ML-Based GraphCast Outperforms the World’s Best Medium-Range Weather Forecasting System

In the new paper GraphCast: Learning Skillful Medium-Range Global Weather Forecasting, a research team from DeepMind and Google presents GraphCast, a machine-learning (ML)-based weather simulator that scales well with data and can generate a 10-day forecast in under 60 seconds. GraphCast outperforms the world’s most accurate deterministic operational medium-range weather forecasting system and betters existing ML-based benchmarks.

AI Computer Vision & Graphics Machine Learning & Data Science Research

OpenAI’s Point·E: Generating 3D Point Clouds From Complex Prompts in Minutes on a Single GPU

In the new paper Point-E: A System for Generating 3D Point Clouds from Complex Prompts, an OpenAI research team presents Point·E, a system for text-conditional synthesis of 3D point clouds that leverages diffusion models to generate diverse and complex 3D shapes conditioned on complex text prompts in minutes on a single GPU.

AI Machine Learning & Data Science Nature Language Tech Research

Microsoft’s Structured Prompting Breaks In-Context Learning Length Limits, Scales to Thousands of Examples

In the new paper Structured Prompting: Scaling In-Context Learning to 1,000 Examples, a Microsoft Research team proposes structured prompting. The novel approach breaks through conventional in-context learning length limits, scaling to thousands of examples with reduced computational complexity and superior performance and stability.

AI Computer Vision & Graphics Machine Learning & Data Science Research

Maryland U & NYU’s Visual Exploration Reveals What Vision Transformers Learn

In the new paper What Do Vision Transformers Learn? A Visual Exploration, a research team from the University of Maryland and New York University uses large-scale feature visualizations from a wide range of vision transformers to gain insights into what they learn from images and how they differ from convolutional neural networks.

AI Machine Learning & Data Science Research

Microsoft’s E5 Text Embedding Model Tops the MTEB Benchmark With 40x Fewer Parameters

In the new paper Text Embeddings by Weakly-Supervised Contrastive Pre-training, a Microsoft research team introduces Embeddings from Bidirectional Encoder Representations (E5), a general-purpose text embedding model for tasks requiring a single-vector representation of texts and the first model to surpass the BM25 baseline on the BEIR retrieval benchmark under a zero-shot setting.

AI Machine Learning & Data Science Research

Google & Lund U’s Optimus Learned Optimization Architecture Efficiently Captures Complex Dependencies

In the new paper Transformer-Based Learned Optimization, a Google Research and Lund University team presents Optimus, an expressive neural network architecture for learned optimization that captures complex dependencies in the parameter space and achieves competitive results on real-world tasks and benchmark optimization problems.

AI Machine Learning & Data Science Nature Language Tech Research

DeepMind & UCL Fine-tune a 70B Parameter LM to Generate Statements Agreeable to Humans with Diverse Opinions

In the new paper Fine-tuning Language Models To Find Agreement Among Humans With Diverse Preferences, a research team from DeepMind and University College London fine-tunes a 70 billion parameter language model to generate statements that maximize agreement among a human group with diverse written opinions.

AI Machine Learning & Data Science Research

Alibaba’s VQRF Realizes a 100x Compression Rate, Reducing Volumetric Radiance Files to 1 MB

In the new paper Compressing Volumetric Radiance Fields to 1 MB, an Alibaba Group research team proposes vector quantized radiance fields (VQRF), a simple yet efficient framework for compressing volumetric radiance fields that achieves up to 100x storage reduction, reducing the original grid model size to around 1 MB with negligible loss in rendering quality.

AI Machine Learning & Data Science Research

Stanford U & Google’s Convex Analytic Training Framework Improves the Understanding and Optimization of Transformers

In the new paper Convexifying Transformers: Improving Optimization and Understanding of Transformer Networks, a Stanford University and Google Research team provides a solid theoretical analysis of transformers’ fundamental mechanisms and introduces a novel convex analytic training framework for improving their optimization.

AI Machine Learning & Data Science Research

DeepMind Studies Process- vs Outcome-based Model Supervision, Significantly Reducing Reasoning Errors on Math Word Problems

In the new paper Solving Math Word Problems With Process- and Outcome-based Feedback, a DeepMind research team conducts the first comprehensive comparison between process- and outcome-based model supervision. The two approaches achieve comparable final-answer error rate improvements on math word problems, while the process-based method significantly reduces reasoning errors from 14.0 to just 3.4 percent.

AI Machine Learning & Data Science Research

No Images Are Needed! Allen AI’s CLOSE Learns to Complete Visual Tasks From Text Inputs Alone

In the new paper I Can’t Believe There’s No Images! Learning Visual Tasks Using only Language Data, an Allen Institute for Artificial Intelligence team proposes Cross Modal Transfer On Semantic Embeddings (CLOSE), an approach that learns high-level skills from textual data, then uses these skills to complete vision tasks without additional visual training data.

AI Machine Learning & Data Science Research

NeurIPS 2022 | MIT & Meta Enable Gradient Descent Optimizers to Automatically Tune Their Own Hyperparameters

In the NeurIPS 2022 Outstanding Paper Gradient Descent: The Ultimate Optimizer, MIT CSAIL and Meta researchers present a novel technique that enables gradient descent optimizers such as SGD and Adam to tune their hyperparameters automatically. The method requires no manual differentiation and can be stacked recursively to many levels.
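To give a feel for what it means for an optimizer to tune its own step size, here is a minimal sketch of plain SGD that adjusts its learning rate by gradient descent on the loss. Note the key difference from the paper: the hypergradient here is derived by hand for plain SGD, whereas the paper's point is that automatic differentiation makes this unnecessary. The function name, the toy objective and the values of `alpha` and `kappa` are all assumptions chosen for illustration.

```python
from typing import Callable


def sgd_with_learned_step_size(
    grad: Callable[[float], float],
    w: float,
    alpha: float = 0.01,   # initial step size
    kappa: float = 0.001,  # step size used to update alpha itself
    steps: int = 100,
) -> tuple[float, float]:
    """Run SGD on a scalar parameter while also descending on the step size."""
    prev_grad = 0.0
    for _ in range(steps):
        g = grad(w)
        # Since w = w_prev - alpha * prev_grad, d(loss)/d(alpha) = -g * prev_grad.
        # A descent step on alpha therefore adds kappa * g * prev_grad.
        alpha += kappa * g * prev_grad
        w -= alpha * g
        prev_grad = g
    return w, alpha


# Toy objective f(w) = 0.5 * w**2, whose gradient is simply w.
final_w, final_alpha = sgd_with_learned_step_size(lambda w: w, w=5.0)
```

On this toy problem the step size grows from its deliberately small starting value while successive gradients agree in sign, so the parameter reaches the minimum far faster than it would with the step size held fixed at 0.01.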