Category: Research

Technical review of the newest machine intelligence research.

AI Machine Learning & Data Science Research

PokéLLMon Triumph: Georgia Tech Unleashes the First LLM Agent Mastering Human-Level Skills in Pokémon Battles

In a new paper PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models, a Georgia Institute of Technology research team introduces PokéLLMon, a pioneering LLM-embodied agent demonstrating human-competent performance in tactical battle games.

AI Machine Learning & Data Science Research

Google and UT Austin’s Game-Changing Approach Distills Vision-Language Models on Millions of Videos

In a new paper Distilling Vision-Language Models on Millions of Videos, a research team introduces a straightforward yet highly effective method to adapt image-based vision-language models to video. The approach generates high-quality pseudo-captions for millions of videos and outperforms state-of-the-art methods across various video-language benchmarks.

AI Machine Learning & Data Science Research

Nature’s New Breakthrough: Control Human Language Network via Large Language Model

In a new breakthrough paper Driving and suppressing the human language network using large language models, a research team from Massachusetts Institute of Technology, MIT-IBM Watson AI Lab, University of Minnesota and Harvard University leverages a GPT-based encoding model to identify sentences predicted to elicit specific responses within the human language network.

AI Machine Learning & Data Science Research

Google’s AMIE Marks A Significant Milestone Toward Conversational Diagnostic AI

In a new paper Towards Conversational Diagnostic AI, a research team from Google Research and Google DeepMind introduces AMIE (Articulate Medical Intelligence Explorer), an LLM-based AI system meticulously optimized for clinical history-taking and diagnostic dialogues, showcasing superior diagnostic accuracy and outperforming primary care physicians (PCPs).

AI Machine Learning & Data Science Nature Language Tech Research

LangSplat: Turbocharging 3D Language Fields with a Mind-Blowing 199x Speed Boost

In a new paper LangSplat: 3D Language Gaussian Splatting, a research team from Tsinghua University and Harvard University introduces LangSplat, a 3D Gaussian Splatting-based method designed for 3D language fields, which surpasses the state-of-the-art LERF method while running 199 times faster.

AI Machine Learning & Data Science Research

Gemini: Bridging Tomorrow’s Deep Neural Network Frontiers with Unrivaled Chiplet Accelerator Mastery

A research team introduces Gemini, an innovative framework for architecture and mapping co-exploration that aims to propel large-scale DNN chiplet accelerators to new heights, achieving an average performance improvement of 1.98× and an energy efficiency boost of 1.41× compared to the state-of-the-art Simba architecture.

AI Machine Learning & Data Science Nature Language Tech Research

A Robot Chemist Driven by GPT-4 Made Its Debut in Nature: Autonomously Designs Reactions and Performs Complex Experiments

In a new paper Autonomous chemical research with large language models, a research team from Carnegie Mellon University and Emerald Cloud Lab introduces an innovative LLM-powered system named Coscientist, which autonomously designs, plans, and executes complex scientific experiments, marking a significant leap forward in the integration of laboratory automation technologies with powerful language models.

AI Machine Learning & Data Science Research

Microsoft’s TaskWeaver: Empowering Intelligent Conversational Agents for Handling Domain-Specific Complex Tasks

A Microsoft research team introduces TaskWeaver, a cutting-edge, code-first framework designed to empower LLM-powered autonomous agents. TaskWeaver offers a potent and flexible platform for constructing intelligent conversational agents capable of handling complex tasks and seamlessly adapting to domain-specific scenarios.

AI Machine Learning & Data Science Research

Spatial-Temporal Innovation: STLVQE Redefines Real-Time Video Enhancement for an Unmatched Viewing Experience

A paper titled “Online Video Quality Enhancement with Spatial-Temporal Look-up Tables” introduces a novel method, STLVQE. This research, conducted by a team from Tongji University and Microsoft Research Asia, pioneers the exploration of the online video quality enhancement problem and presents the first method achieving real-time processing speed.

AI Machine Learning & Data Science Research

DeepMind’s DiLoCo Revolutionizes Language Model Training with 500× Less Communication

In a new paper DiLoCo: Distributed Low-Communication Training of Language Models, a Google DeepMind research team presents Distributed Low-Communication (DiLoCo). DiLoCo employs a distributed optimization algorithm that facilitates the training of language models on islands of poorly connected devices, matching the performance of fully synchronous optimization while communicating 500 times less.

AI Machine Learning & Data Science Research

Microsoft Orca 2’s Triumph: Comparable or Superior Performance to Models 5-10x Its Size in Mastering Reasoning Tasks

Microsoft has recently unveiled Orca 2 in a new paper titled “Orca 2: Teaching Small Language Models How to Reason,” which explores how enhanced training signals can augment the reasoning abilities of smaller language models. Notably, Orca 2 surpasses models of similar size, achieving performance levels comparable to or better than models 5-10 times larger.

AI Computer Vision & Graphics Machine Learning & Data Science Research

Adobe & ANU’s LRM Reconstructs Models For Single Image to 3D in 5s

In a new paper LRM: Large Reconstruction Model for Single Image to 3D, a research team from Adobe Research and Australian National University introduces an innovative Large Reconstruction Model (LRM). The model can predict a 3D model of an object from a single input image in just 5 seconds.

AI Machine Learning & Data Science Research

Google’s E3 TTS Provides Effortless Approach to High-Quality Audio Synthesis Through Diffusion Models

In a new paper E3 TTS: Easy End-to-End Diffusion-based Text to Speech, a Google research team proposes Easy End-to-End Diffusion-based Text to Speech. This streamlined and efficient text-to-speech model hinges solely on diffusion to preserve temporal structure, allowing it to accept plain text as input and generate audio waveforms directly.

AI Machine Learning & Data Science Research

Apple Repurposes Large Language Models for Reinforcement Learning Challenges in Embodied AI

An Apple research team presents Large LAnguage model Reinforcement Learning Policy (LLaRP). LLaRP effectively repurposes LLMs for Reinforcement Learning (RL) challenges within the realm of Embodied Artificial Intelligence (AI), achieving a remarkable 1.7 times higher success rate compared to other established baselines and zero-shot LLM applications.

AI Machine Learning & Data Science Research

Microsoft Azure’s Idea2Img: Enabling Automatic Image Design and Generation with Enhanced Image Quality

A Microsoft Azure AI research team introduces Idea2Img in the paper “Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation,” which leverages the capabilities of GPT-4V(ision) for automatic image design and generation with enhanced image quality.