Deep Neural Networks

by Synced 2024-04-30 1

MovieChat+: Elevating Zero-Shot Long Video Understanding to New Heights

A pioneering research group introduces MovieChat, a novel framework tailored to accommodate extensive video durations exceeding 10,000 frames. This innovative system achieves unprecedented performance in deciphering prolonged video content.

by Synced 2024-04-27 3

AI Machine Learning & Data Science Research

CMU & Meta’s TriForce: Turbocharging Long Sequence Generation with 2.31× Speed Boost on A100 GPU

In a new paper TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding, a research team from CMU and Meta introduces TriForce—a hierarchical speculative decoding system tailored for scalable long sequence generation, reaching up to 2.31× on an A100 GPU.

by Synced 2024-04-24 1

AI Machine Learning & Data Science Research

Decoding Code Execution: How DeepMind’s NExT Empowers AI Reasoning

In a new paper NExT: Teaching Large Language Models to Reason about Code Execution, a Google DeepMind research team proposes Naturalized Execution Tuning (NExT), a method aims to equip LLMs with the ability to scrutinize program execution traces and deduce runtime behaviors through chain-of-thought (CoT) rationales.

by Synced 2024-04-22 0

AI Machine Learning & Data Science Research

NVIDIA’s ScaleFold Slashes AlphaFold’s Training Time to 10 Hours

An NVIDIA research team presents ScaleFold, a novel and scalable training methodology tailored for the AlphaFold model, which accomplishes the OpenFold partial training task in a mere 7.51 minutes—over six times faster than the benchmark baseline—ultimately slashing the AlphaFold’s initial training time to a remarkable 10 hours.

by Synced 2024-04-20 1

AI Machine Learning & Data Science Research

DeepMind’s RecurrentGemma Pioneering Efficiency for Open Small Language Models

A Google DeepMind research team introduce RecurrentGemma, an open language model built on Google’s innovative Griffin architecture, which reduces memory usage and facilitates efficient inference on lengthy sequences, thereby unlocking new possibilities for highly efficient small language models in environments where resources are limited.

by Synced 2024-04-18 0

AI Machine Learning & Data Science Research

87% ImageNet Accuracy, 3.8ms Latency: Google’s MobileNetV4 Redefines On-Device Mobile Vision

A Google research team unveils the latest iteration of MobileNets: MobileNetV4 (MNv4). This cutting-edge model boasts an impressive 87% ImageNet-1K accuracy, coupled with an astonishingly low Pixel 8 EdgeTPU runtime of merely 3.8ms.

by Synced 2024-04-16 1

AI Machine Learning & Data Science Research

Unveiling the Black Box: Meta’s LM Transparency Tool Deciphers Transformer Language Models

In a new paper LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models, a research team from Meta, University College London and Universitat Politècnica de Catalunya introduces the LM Transparency Tool (LM-TT), an open-source interactive toolkit designed for dissecting Transformer-based language models.

by Synced 2024-04-10 1

AI Machine Learning & Data Science Research

Revolutionizing Video Understanding: Real-Time Captioning for Any Length with Google’s Streaming Model

In a new paper Streaming Dense Video Captioning, a Google research team proposes a streaming dense video captioning model, which revolutionizes dense video captioning by enabling the processing of videos of any length and making predictions before the entire video is fully analyzed, thus marking a significant advancement in the field.

by Synced 2024-04-08 1

AI Machine Learning & Data Science Research

AURORA-M: A Global Symphony of Innovation as 33 Prestigious Institutions Unify for Open-Source Multilingual Mastery

A collaborative effort involving researchers from 33 institutions presents AURORA-M, the inaugural open-source model not only excels in multilingual understanding and coding tasks but also underscores the collaborative ethos of the open-source community, promoting transparency and accessibility in AI development.

by Synced 2024-04-03 0

AI Machine Learning & Data Science Research

Huawei & Peking U’s DiJiang: A Transformer Achieving LLaMA2-7B Performance at 1/50th the Training Cost

A research team from Huawei and Peking University introduces DiJiang, a groundbreaking Frequency Domain Kernelization approach, which facilitates the transition to a linear complexity model with minimal training overhead, achieving performance akin to LLaMA2-7B across various benchmarks, but at just 1/50th of the training cost.

by Synced 2024-03-31 1

AI Machine Learning & Data Science Research

KCL Leverages Topos Theory to Decode Transformer Architectures

A King’s College London research team delves into a theoretical exploration of the transformer architecture, employing the lens of topos theory. This innovative approach conjectures that the factorization through “choose” and “eval” morphisms can yield effective neural network architecture designs.

by Synced 2024-03-29 1

AI Machine Learning & Data Science Research

Robotic Marvels: Conquering San Francisco’s Streets Through Next Token Prediction

A research team from University of California, Berkeley presents a causal transformer model trained via autoregressive prediction of sensorimotor trajectories, culminating in the remarkable feat of enabling a full-sized humanoid to navigate the streets of San Francisco in a zero-shot manner.

by Synced 2024-03-27 1

AI Machine Learning & Data Science Nature Language Tech Research

First Model-Stealing Attack Reveals Secrets of Black-Box Production Language Models

In a new paper Stealing Part of a Production Language Model, a research team introduces the first model-stealing attack that unveils precise, nontrivial information from black-box production language models such as OpenAI’s ChatGPT or Google’s PaLM-2.

by Synced 2024-03-25 3

AI Machine Learning & Data Science Research

DeepMind & UBC’s Genie: A Revolutionary Leap in Generative AI for Interactive Virtual Worlds

A research team from Google DeepMind and University of British Columbia presents Genie, the first generative interactive environment capable of seamlessly generating a diverse array of controllable virtual worlds based on textual prompts, synthetic images, photographs, and even sketches.

by Synced 2024-03-20 0

AI Machine Learning & Data Science Research

ByteDance’s AnimateDiff-Lightning Shines in State-of-the-Art Video Creation in Lightning Speed

A ByteDance research team presents AnimateDiff-Lightning, a novel approach that utilizes progressive adversarial diffusion distillation, catapulting video generation into a realm of lightning-fast performance while simultaneously achieving unprecedented results in few-step video generation.

by Synced 2024-03-18 1

AI Machine Learning & Data Science Research

Stanford’s VideoAgent Achieves New SOTA of Long-Form Video Understanding via Agent-Based System

In a new paper VideoAgent: Long-form Video Understanding with Large Language Model as Agent, a Stanford University research team introduces VideoAgent, an innovative approach simulates human comprehension of long-form videos through an agent-based system, showcasing superior effectiveness and efficiency compared to current state-of-the-art methods.

by Synced 2024-03-16 2

AI Machine Learning & Data Science Research

DeepMind’s Gemma: Advancing AI Safety and Performance with Open Models

Google introduces Gemma, a suite of lightweight, cutting-edge open models derived from the same research and technology underpinning the powerful Gemini models, which mark a significant leap forward in performance relative to existing open models across academic benchmarks for language comprehension, reasoning, and safety.

by Synced 2024-03-11 1

AI Machine Learning & Data Science Research

Fast Tracks to Diverse Behaviors: VQ-BeT Achieves 5x Speed Surge Compared to Diffusion Policies

In a new paper Behavior Generation with Latent Actions, a research team introduces the Vector-Quantized Behavior Transformer (VQ-BeT), an innovative model offers a solution for behavior generation, addressing multimodal action prediction, conditional generation, and partial observations.

by Synced 2024-03-07 1

AI Machine Learning & Data Science Research

BasedAI: A Decentralized Solution for Seamless Integration of Privacy and Performance in Large Language Models

In a new paper BasedAI: A decentralized P2P network for Zero Knowledge Large Language Models (ZK-LLMs), Based Labs proposes BasedAI, which offers a decentralized approach that seamlessly integrates FHE with LLMs to uphold data confidentiality without sacrificing performance.

by Synced 2024-03-05 1

AI Machine Learning & Data Science Research

Transcend The Boundaries of Language Models: bGPT Enables Deeper Understanding Through Byte Prediction

In a new paper Beyond Language Models: Byte Models are Digital World Simulators, a research team introduces bGPT, a pioneering model engineered explicitly for processing binary data and simulating the digital world through next-byte prediction.

by Synced 2024-02-29 2

AI Machine Learning & Data Science Research

Embracing the Era of 1-Bit LLMs: Microsoft & UCAS’s BitNet b1.58 Redefines Efficiency

In a new paper The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits, a research team introduces a new variant of 1-bit LLMs called BitNet b1.58, which preserves the advantages of the original 1-bit BitNet while ushering in a novel computational paradigm that significantly enhances cost-effectiveness in terms of latency, memory usage, throughput, and energy consumption.

by Synced 2024-02-27 0

AI Machine Learning & Data Science Research

NVIDIA’s Nemotron-4 15B Dominates Multilingual Domain, Defeating 4× Larger Rivals

In a new paper Nemotron-4 15B Technical Report , an NVIDIA research team introduces Nemotron-4 15B. Nemotron-4 15B is comprising 15 billion parameters, is trained on an extensive corpus of 8 trillion text tokens, showcasing unparalleled multilingual capabilities among models of comparable size.

by Synced 2024-02-25 3

AI Machine Learning & Data Science Research

Microsoft’s LongRoPE Breaks the Limit of Context Window of LLMs, Extents it to 2 Million Tokens

In a new paper LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens, a Microsoft research team introduces LongRoPE, a pioneering method that extends the context window of pre-trained LLMs to an impressive 2048k tokens while preserving performance at the original short context window.

by Synced 2024-02-24 2

AI Machine Learning & Data Science Research

Yann LeCun & Randall Balestriero Optimize Deep Learning for Perception Tasks

In a new paper Learning by Reconstruction Produces Uninformative Features For Perception, researchers Randall Balestriero and Yann LeCun shed light on why reconstruction-based learning yields compelling reconstructed samples but falters in delivering competitive latent representations for perception.

by Synced 2024-02-19 2

AI Machine Learning & Data Science Research

Apple’s Keyframer: Redefining Animation Prototyping with Language-Guided Design

An Apple research team introduces Keyframer, a groundbreaking animation prototyping tool fueled by LLM technology. Keyframer facilitates the generation of animations from static images (SVGs), empowering users to explore design alternatives, facilitate comparisons, and foster ideation.

by Synced 2024-02-18 4

AI Machine Learning & Data Science Popular Research

Unveiling Sora: OpenAI’s Breakthrough in Text-to-Video Generation

In a recent technical report, OpenAI introduces Sora, a groundbreaking text-to-video model. Sora stands out for its ability to generate videos and images spanning a wide range of durations, aspect ratios, and resolutions, producing up to a minute of high-definition video content.

by Synced 2024-02-16 0

AI Machine Learning & Data Science Research

DeepMind & Stanford U’s UNFs: Advancing Weight-Space Modeling with Universal Neural Functionals

A research team from Google DeepMind and Stanford University introduces a groundbreaking algorithm known as universal neural functionals (UNFs), which autonomously constructs permutation-equivariant models for any weight space, offering a versatile solution to the architectural constraints encountered in prior works.

by Synced 2024-02-11 2

AI Machine Learning & Data Science Research

Introducing NVIDIA’s Audio Flamingo, the Next Frontier in Audio Language Models

An NVIDIA research team introduces Audio Flamingo, a groundbreaking audio language model that incorporates in-context learning (ICL), retrieval augmented generation (RAG), and multi-turn dialogue capabilities, achieving SOTA performance across various audio understanding tasks.

by Synced 2024-02-07 7

AI Machine Learning & Data Science Nature Language Tech Research

Nomic Embed: The Inaugural Open-Source Long Text Embedding Model Outshining OpenAI’s Finest

In a new paper Nomic Embed: Training a Reproducible Long Context Text Embedder, a Nomic AI research team introduces nomic-embed-text-v1, which marks the inception of the first fully reproducible, open-source, open-weights, open-data text embedding model, capable of handling an extensive context length of 8192 in English.

by Synced 2024-02-05 1

AI Machine Learning & Data Science Research

PokéLLMon Triumph: Georgia Tech Unleashes the First LLM Agent Mastering Human-Level Skills in Pokemon Battles

In a new paper PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models, a Georgia Institute of Technology research team introduces PokéLLMon, a pioneering LLM-embodied agent demonstrating human-competent performance in tactical battle games.

by Synced 2024-01-31 4

AI Machine Learning & Data Science Research

Neural Networks on the Brink of Universal Prediction with DeepMind’s Cutting-Edge Approach

In a new paper Learning Universal Predictors, a Google DeepMind research team proposes the utilization of Universal Turing Machines (UTMs) for generating training data, thereby enhancing meta-learning and enabling trained neural networks capable of mastering universal prediction strategies.

by Synced 2024-01-29 2

AI Machine Learning & Data Science Research

Google and UT Austin’s Game-Changing Approach Distills Vision-Language Models on Millions of Videos

In a new paper Distilling Vision-Language Models on Millions of Videos, a research team introduces a straightforward yet highly effective method to adapt image-based vision-language models to video. The approach involves generating high-quality pseudo-captions for millions of videos, outperforming state-of-the-art methods across various video-language benchmarks.

by Synced 2024-01-27 4

AI Machine Learning & Data Science Research

Stanford U & Open AI’s Meta-Prompting Elevates Language Model Performance, Surpassing Standard Prompting by 17%

In a new paper Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding, the team introduces meta-prompting. This innovative scaffolding approach proves to be highly effective, surpassing standard prompting by 17.1%, expert (dynamic) prompting by 17.3%, and multi-persona prompting by 15.2%.

by Synced 2024-01-23 2

AI Machine Learning & Data Science Research

NVIDIA’s ChatQA Reaches GPT-4 Performance Without Using Data From OpenAI GPT

In a new paper ChatQA: Building GPT-4 Level Conversational QA Models, an NVIDIA research team introduces ChatQA, a suite of conversational question-answering models that achieve GPT-4 level accuracies without relying on synthetic data from OpenAI GPT models.

by Synced 2024-01-22 5

AI Machine Learning & Data Science Research

DeepMind’s GATS: A Novel Module for Seamless Integration of Multimodal Foundation Models

In a new paper GATS: Gather-Attend-Scatter, a Google DeepMind research team introduces Gather-Attend-Scatter (GATS), a pioneering module designed to seamlessly combine pretrained foundation models—whether trainable or frozen—into larger multimodal networks.

by Synced 2024-01-20 10

AI Machine Learning & Data Science Research

Nature’s New Breakthrough: Control Human Language Network via Large Language Model

In a new breakthrough paper Driving and suppressing the human language network using large language models, a research team from Massachusetts Institute of Technology, MIT-IBM Watson AI Lab, University of Minnesota and Harvard University leverages a GPT-based encoding model to identify sentences predicted to elicit specific responses within the human language network.

by Synced 2024-01-17 4

AI Machine Learning & Data Science Research

Google’s AMIE Marks A Significant Milestone Toward Conversational Diagnostic AI

In a new paper Towards Conversational Diagnostic AI, a research team from Google Research and Google DeepMind introduces AMIE (Articulate Medical Intelligence Explorer), an LLM-based AI system meticulously optimized for clinical history-taking and diagnostic dialogues, showcasing superior diagnostic accuracy and outperforming primary care physicians (PCPs).

by Synced 2024-01-09 0

AI Machine Learning & Data Science Nature Language Tech Research

Beyond Behemoths: How Blended Chat AIs Outshine Trillion-Parameters ChatGPT with Elegance

Can a collective of moderately-sized LLMs collaboratively constitute a chat AI with equivalent or superior abilities? Motivated by this query, a new paper “Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM” confirms this idea and introduces the Blended approach.

by Synced 2024-01-07 3

AI Machine Learning & Data Science Nature Language Tech Research

LangSplat: Turbocharging 3D Language Fields with a Mind-Blowing 199x Speed Boost

In a new paper LangSplat: 3D Language Gaussian Splattin, a research team from Tsinghua University and Harvard University introduces LangSplat, a groundbreaking 3D Gaussian Splatting-based method designed for 3D language fields, which surpasses the state-of-the-art LERF method while boasting a remarkable speed improvement of 199 times.

by Synced 2024-01-02 0

AI Machine Learning & Data Science Research

Gemini: Bridging Tomorrow’s Deep Neural Network Frontiers with Unrivaled Chiplet Accelerator Mastery

A research team introduces Gemini, an innovative framework, focusing on both architecture and mapping co-exploration, aims to propel large-scale DNN chiplet accelerators to new heights, achieving an impressive average performance improvement of 1.98× and an energy efficiency boost of 1.41× compared to the state-of-the-art Simba architecture.