Tag: large language model

AI Machine Learning & Data Science Research

Decoding Code Execution: How DeepMind’s NExT Empowers AI Reasoning

In a new paper NExT: Teaching Large Language Models to Reason about Code Execution, a Google DeepMind research team proposes Naturalized Execution Tuning (NExT), a method that aims to equip LLMs with the ability to scrutinize program execution traces and deduce runtime behaviors through chain-of-thought (CoT) rationales.
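The raw material for this kind of tuning is a program's execution trace: line-by-line records of variable states as the code runs. Below is a minimal sketch, not the paper's pipeline, of collecting such a trace with Python's built-in `sys.settrace`; the function names (`trace_variables`, `buggy_abs_sum`) are hypothetical examples.

```python
import sys

def trace_variables(func, *args):
    """Collect per-line local-variable snapshots while func runs —
    a toy stand-in for the execution traces NExT reasons over."""
    events = []
    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is func.__code__:
            events.append((frame.f_lineno, dict(frame.f_locals)))
        return tracer
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)
    return result, events

def buggy_abs_sum(xs):
    total = 0
    for x in xs:
        total += x          # bug: should be abs(x)
    return total

result, trace = trace_variables(buggy_abs_sum, [1, -2, 3])
# The trace shows `total` dipping negative mid-loop — exactly the
# kind of runtime evidence a CoT rationale can point to when
# explaining why the function returns 2 instead of 6.
```

A model tuned on naturalized versions of such traces can ground its explanation of a bug in observed variable states rather than static code alone.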

AI Machine Learning & Data Science Research

Stanford’s VideoAgent Achieves New SOTA of Long-Form Video Understanding via Agent-Based System

In a new paper VideoAgent: Long-form Video Understanding with Large Language Model as Agent, a Stanford University research team introduces VideoAgent, an innovative approach that simulates human comprehension of long-form videos through an agent-based system, showcasing superior effectiveness and efficiency compared to current state-of-the-art methods.

AI Machine Learning & Data Science Research

Embracing the Era of 1-Bit LLMs: Microsoft & UCAS’s BitNet b1.58 Redefines Efficiency

In a new paper The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits, a research team introduces a new variant of 1-bit LLMs called BitNet b1.58, which preserves the advantages of the original 1-bit BitNet while ushering in a novel computational paradigm that significantly enhances cost-effectiveness in terms of latency, memory usage, throughput, and energy consumption.
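The "1.58 bits" comes from constraining every weight to one of three values, {-1, 0, +1}: log2(3) ≈ 1.58. Below is a minimal NumPy sketch of the paper's absmean quantization scheme (scale by the mean absolute weight, then round and clip to the ternary set); it illustrates the idea only, not the full training procedure.

```python
import numpy as np

def absmean_ternary_quantize(W, eps=1e-8):
    # Scale weights by their mean absolute value, then round each
    # entry to the nearest value in {-1, 0, +1} (absmean quantization).
    gamma = np.abs(W).mean() + eps
    W_q = np.clip(np.round(W / gamma), -1, 1)
    return W_q, gamma

W = np.array([[0.9, -0.05, 0.4],
              [-1.2, 0.02, 0.7]])
W_q, gamma = absmean_ternary_quantize(W)
# W_q contains only -1, 0, and +1; small weights collapse to 0,
# which is what lets matrix multiplies reduce to additions.
```

Because ternary weights turn multiplications into sign-flips and skips, this is where the latency, memory, and energy savings originate.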

AI Machine Learning & Data Science Research

Nature’s New Breakthrough: Controlling the Human Language Network via Large Language Models

In a new breakthrough paper Driving and suppressing the human language network using large language models, a research team from Massachusetts Institute of Technology, MIT-IBM Watson AI Lab, University of Minnesota and Harvard University leverages a GPT-based encoding model to identify sentences predicted to elicit specific responses within the human language network.

AI Machine Learning & Data Science Research

Google’s AMIE Marks A Significant Milestone Toward Conversational Diagnostic AI

In a new paper Towards Conversational Diagnostic AI, a research team from Google Research and Google DeepMind introduces AMIE (Articulate Medical Intelligence Explorer), an LLM-based AI system meticulously optimized for clinical history-taking and diagnostic dialogues, showcasing superior diagnostic accuracy and outperforming primary care physicians (PCPs).

AI Machine Learning & Data Science Nature Language Tech Research

A Robot Chemist Driven by GPT-4 Made Its Debut in Nature: Autonomously Designs Reactions and Performs Complex Experiments

In a new paper Autonomous chemical research with large language models, a research team from Carnegie Mellon University and Emerald Cloud Lab introduces an innovative LLM-powered system named Coscientist, which autonomously designs, plans, and executes complex scientific experiments, marking a significant leap forward in the integration of laboratory automation technologies with powerful language models.

AI Machine Learning & Data Science Research

Microsoft’s TaskWeaver: Empowering Intelligent Conversational Agents for Handling Domain-Specific Complex Tasks

A Microsoft research team introduces TaskWeaver, a cutting-edge, code-first framework designed to empower LLM-powered autonomous agents. TaskWeaver offers a potent and flexible platform for constructing intelligent conversational agents capable of handling complex tasks and seamlessly adapting to domain-specific scenarios.

AI Machine Learning & Data Science Research

DeepMind’s DiLoCo Revolutionizes Language Model Training with 500× Less Communication

In a new paper DiLoCo: Distributed Low-Communication Training of Language Models, a Google DeepMind research team presents Distributed Low-Communication (DiLoCo). DiLoCo employs a distributed optimization algorithm that facilitates the training of language models on islands of poorly connected devices, surpassing the performance of fully synchronous models while reducing communication by 500 times.
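The core loop behind DiLoCo is local SGD-style training: each worker takes many inner optimization steps independently, and only the resulting parameter deltas are averaged and applied in an outer step each communication round. The toy NumPy sketch below illustrates that structure on a quadratic objective; the loss, step counts, and plain outer SGD step are illustrative simplifications (the paper uses AdamW inner and Nesterov-momentum outer optimizers).

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy per-worker objective: f_k(x) = 0.5 * ||x - c_k||^2 (optimum at c_k).
centers = rng.normal(size=(4, 3))          # 4 workers, 3-dim parameters

def local_train(x, c, steps=50, lr=0.1):
    """H inner gradient steps on one worker's shard — no communication."""
    for _ in range(steps):
        x = x - lr * (x - c)               # gradient of 0.5 * ||x - c||^2
    return x

theta = np.zeros(3)                        # shared parameters
for outer_round in range(10):              # T communication rounds
    deltas = [theta - local_train(theta.copy(), c) for c in centers]
    outer_grad = np.mean(deltas, axis=0)   # averaged "outer gradient"
    theta = theta - 1.0 * outer_grad       # simplified outer SGD step

# theta ends up near the mean of the workers' optima, having
# communicated only once per outer round instead of once per step.
```

With hundreds of inner steps per round, communication drops by the same factor, which is how the 500× reduction arises.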

AI Machine Learning & Data Science Research

Apple Repurposes Large Language Models for Reinforcement Learning Challenges in Embodied AI

An Apple research team presents Large LAnguage model Reinforcement Learning Policy (LLaRP). LLaRP effectively repurposes LLMs for Reinforcement Learning (RL) challenges within the realm of Embodied Artificial Intelligence (AI), achieving a remarkable 1.7 times higher success rate compared to other established baselines and zero-shot LLM applications.

AI Machine Learning & Data Science Nature Language Tech Research

The Reversal Curse: Uncovering the Intriguing Limits of Language Models

In a new paper titled “The Reversal Curse: LLMs trained on ‘A is B’ fail to learn ‘B is A’”, a collaborative research team from Vanderbilt University, the UK Frontier AI Taskforce, Apollo Research, New York University, the University of Sussex, and the University of Oxford unveils a remarkable shortcoming in auto-regressive large language models (LLMs).

AI Machine Learning & Data Science Nature Language Tech Research

Unveiling the Enigma: Meta AI & UPC Decode the Inner Workings of Large-Scale Language Models

In a new paper Neurons in Large Language Models: Dead, N-gram, Positional, a research team from Meta AI and Universitat Politècnica de Catalunya conducts a comprehensive analysis of a family of Open Pre-trained Transformer (OPT) language models with up to 66B parameters to provide insights into how feed-forward network (FFN) layers act.
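One of the paper's neuron classes is easy to make concrete: a "dead" FFN neuron is one whose post-activation output is zero on every input in a large sample. The NumPy sketch below illustrates that check on synthetic activations; the data is fabricated for illustration, not drawn from OPT.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical post-ReLU FFN activations: (tokens, neurons).
acts = np.maximum(rng.normal(size=(1000, 8)), 0.0)
acts[:, [2, 5]] = 0.0                      # force two neurons to be "dead"

# A neuron is "dead" if it never fires on any sampled token.
dead = np.flatnonzero((acts > 0).sum(axis=0) == 0)
# dead picks out exactly the neurons zeroed above.
```

The paper's other classes (n-gram and positional neurons) require inspecting *which* tokens trigger a neuron, but the counting machinery is the same.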

AI Machine Learning & Data Science Research

CMU & Tsinghua U’s Prompt2Model Generates Deployable Models Following Natural Language Instructions

In a new paper Prompt2Model: Generating Deployable Models from Natural Language Instructions, a research team from Carnegie Mellon University and Tsinghua University introduces Prompt2Model, a general-purpose approach that uses prompting to specify system behavior and produces a deployable special-purpose model that enjoys all the advantages thereof.