Machine Learning & Data Science

by Synced 2024-09-06 25

Google’s GameNGen: Bringing Real-Time Game Simulation to Life with Neural Models

In a new paper Diffusion Models Are Real-Time Game Engines, a Google research team presents GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with complex environments over extended sequences, maintaining high-quality output.

by Synced 2024-09-04 7

AI Machine Learning & Data Science Research

Samsung’s MobileQuant: Bringing High-Performance Language Models to Your Pocket

A research team from Samsung makes a first attempt to facilitate LLM deployment on edge devices using integer-only quantization. The proposed MobileQuant is a post-training quantization technique that reduces both inference latency and energy consumption while preserving accuracy comparable to that achieved with 16-bit activations.

by Synced 2024-08-30 5

AI Machine Learning & Data Science Research

NYU & Stanford’s GPUDrive: Achieving Over 1 Million Steps per Second in Multi-Agent Driving Simulations

A research team presents GPUDrive, a GPU-accelerated multi-agent simulator built on the Madrona Game Engine, which is capable of generating over a million experience steps per second, making it a game-changer for applying sample-inefficient yet powerful reinforcement learning algorithms to multi-agent planner design.

by Synced 2024-08-29 4

AI Machine Learning & Data Science Research

NVIDIA’s Minitron: Compressing Llama 3.1 and Mistral NeMo for Superior Performance in 4B and 8B Models

In a new paper LLM Pruning and Distillation in Practice: The Minitron Approach, an NVIDIA research team presents the Minitron compression strategy, which effectively produces a robust 4B model from Llama 3.1 8B and a cutting-edge Mistral-NeMo-Minitron-8B model derived from Mistral NeMo 12B.

by Synced 2024-08-27 4

AI Machine Learning & Data Science Research

Meta’s Sapiens: Revolutionizing Human Pose, Segmentation, and Depth Estimation with Vision Transformers

In a new paper Sapiens: Foundation for Human Vision Models, a Meta research team introduces Sapiens, a suite of models designed to address four core human-centric vision tasks: 2D pose estimation, body-part segmentation, depth estimation, and surface normal prediction.

by Synced 2024-08-26 2

AI Machine Learning & Data Science Research

Open Sparse Autoencoders Everywhere: The Ambitious Vision of DeepMind’s Gemma Scope

In a new paper Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2, a Google DeepMind research team introduces Gemma Scope, a comprehensive suite of JumpReLU SAEs.

by Synced 2024-08-22 4

AI Machine Learning & Data Science Research

Snowflake’s Arctic-TILT: Matching the Power of Models 1,000x Larger in Document Understanding

A Snowflake research team presents Arctic-TILT, a model that is specifically engineered for large-scale, cost-effective deployment while also being adaptable to various domains. It achieves state-of-the-art performance on benchmarks for both business and long documents.

by Synced 2024-08-17 2

AI Machine Learning & Data Science Research

Apple Intelligence: Unveiling Foundation Models Powering the Future of iOS, iPadOS, and macOS

An Apple research team introduces the foundation language models developed to power Apple Intelligence features. These models include a ∼3 billion parameter model optimized for efficient on-device performance and a larger server-based model designed for Private Cloud Compute.

by Synced 2024-08-15 9

AI Machine Learning & Data Science Research

Google DeepMind’s Robot Mastering Human-Level Table Tennis

In a new paper Achieving Human Level Competitive Robot Table Tennis, a Google DeepMind research team introduces the first robot agent that attains amateur human-level performance in competitive table tennis.

by Synced 2024-08-13 2

AI Machine Learning & Data Science Research

NVIDIA’s Wolf: World Summarization Framework Beats GPT-4V on Video Captioning by 55.6%

In a new paper Wolf: Captioning Everything with a World Summarization Framework, a research team introduces a novel approach known as the WOrLd summarization Framework (Wolf). This automated captioning framework significantly advances video captioning—both in terms of quality (improved by 55.6%) and similarity (improved by 77.4%)—compared to GPT-4V.

by Synced 2024-08-09 4

AI Machine Learning & Data Science Research

From 500 Tokens to One: The Breakthrough Power of Cambridge U’s 500xCompressor

In a new paper 500xCompressor: Generalized Prompt Compression for Large Language Models, a Cambridge U team proposes the 500xCompressor, a method designed to condense extensive natural language contexts into as few as one special token, achieving compression ratios ranging from 6x to 480x.

by Synced 2024-08-06 9

AI Machine Learning & Data Science Research

Llama 3: Meta AI’s Multilingual and Multimodal Marvel

In a new paper The Llama 3 Herd of Models, a Meta AI research team presents Llama 3, a new set of foundation models for language that delivers performance comparable to that of state-of-the-art language models such as GPT-4 on a plethora of tasks.

by Synced 2024-07-31 2

AI Machine Learning & Data Science Research

From YouTube to Keys: Transforming Internet Data into Robotic Musical Talent

In a new paper PianoMime: Learning a Generalist, Dexterous Piano Player from Internet Demonstrations, a research team introduces PianoMime, a framework for training a robot to play the piano using internet-sourced demonstrations.

by Synced 2024-07-30 7

AI Machine Learning & Data Science Research

Unlocking Generalist AI Potential in Software Development with OpenDevin

In a new paper OpenDevin: An Open Platform for AI Software Developers as Generalist Agents, a research team introduces OpenDevin, a community-driven platform that supports the development of AI agents that interact with software systems.

by Synced 2024-07-26 4

AI Machine Learning & Data Science Research

From Images to Insights: DeepMind’s Versatile Vision-Language Model PaliGemma Achieves SOTA Results

A DeepMind research team releases PaliGemma, a robust and versatile vision-language model with 3 billion parameters. PaliGemma excels in transfer learning across various vision and language tasks, achieving state-of-the-art performance in a multitude of open-world applications.

by Synced 2024-07-25 4

AI Computer Vision & Graphics Machine Learning & Data Science Research

Automating Video Highlights: Breakthrough Unsupervised Method Leverages Audio and Visual Cues

A research team from Saskatchewan University and Google introduces an innovative unsupervised method for automatic video highlight detection, eliminating the need for manual annotations while achieving superior performance compared to previous methods.

by Synced 2024-07-21 2

AI Machine Learning & Data Science Research

Stanford’s Hypothetical Minds: Revolutionizing Multi-Agent AI with Theory of Mind and Large Language Models

A Stanford University research team proposes Hypothetical Minds, which builds on recent advances in LLM-based agents for multi-agent environments and aims to enhance adaptability in competitive, cooperative, and mixed-motive scenarios with concealed information.

by Synced 2024-07-18 8

AI Machine Learning & Data Science Nature Language Tech Research

Revolutionizing Transformers: DeepMind’s PEER Layer and the Power of a Million Experts

A DeepMind research team introduces PEER, an innovative layer design that leverages the product key technique for sparse retrieval from an extensive pool of tiny experts (over a million), unlocking the potential to scale transformer models further while maintaining computational efficiency.

by Synced 2024-07-16 2

AI Machine Learning & Data Science Research

Overcoming Computational Challenges in Large Language Model Inference with MInference 1.0

A research team from Microsoft and the University of Surrey introduces MInference (Milliontokens Inference), which employs a sparse calculation approach designed to speed up the pre-filling stage of long-sequence processing. It can reduce inference latency by up to 10 times on an A100 GPU while preserving accuracy.

by Synced 2024-07-12 4

AI Machine Learning & Data Science Research

Mastering Enterprise Chatbots: NVIDIA’s Guide to Building Secure RAG-Based Chatbots with Generative AI

In a new paper FACTS About Building Retrieval Augmented Generation-based Chatbots, an NVIDIA research team introduces the FACTS framework, designed to create robust, secure, and enterprise-grade RAG-based chatbots.

by Synced 2024-07-08 7

AI Machine Learning & Data Science Research

Meta AI Unveils LLM Compiler for Advanced Code and Compiler Optimization

A Meta AI research team introduces Meta Large Language Model Compiler, a suite of robust, openly available, pre-trained models specifically designed for code optimization tasks, aiming to provide a scalable, cost-effective foundation for further research and development in compiler optimization.

by Synced 2024-07-03 4

AI Machine Learning & Data Science Research

Google’s SecBoost: Boosting Any Loss Function Beyond Zeroth-Order Limits

In a new paper How to Boost Any Loss Function, a Google research team provides a constructive, formal answer, demonstrating that any loss function can be optimized with boosting.

by Synced 2024-07-01 10

AI Machine Learning & Data Science Research

Achieving 8× Performance Gains with Reinforcement Learning on Synthetic Data in Large Language Models

In a new paper RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold, a research team provides insights into how synthetic data affects performance, suggesting that a specific schema can achieve consistent gains over using only positive data, reaching performance similar to scaling up the synthetic data volume by 8×.

by Synced 2024-06-28 7

AI Machine Learning & Data Science Research

4.5x Performance Boost: University of Illinois’ Multi-Agent AI System Takes on Cyber Threats

A research team from the University of Illinois Urbana-Champaign introduces HPTSA, a multi-agent system that significantly advances the exploitation of cybersecurity vulnerabilities, achieving up to 4.5 times better performance on a benchmark of 15 real-world vulnerabilities compared to previous efforts.

by Synced 2024-06-25 6

AI Machine Learning & Data Science Research

Oxford U & DeepMind Harness Cultural Accumulation in Reinforcement Learning

In a new paper Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning, a research team from the University of Oxford and Google DeepMind introduces methods to achieve cultural accumulation in Reinforcement Learning (RL) agents. This research opens new pathways for modeling human culture through artificial systems.

by Synced 2024-06-21 3

AI Machine Learning & Data Science Research

Contrastive Learning Advances Sleep Science: Superior Multi-Modal Model Enhances Disorder Detection

In a new paper SleepFM: Multi-modal Representation Learning for Sleep Across Brain Activity, ECG and Respiratory Signals, a research team introduces SleepFM, the first attempt at developing a multi-modal contrastive learning (CL) approach for polysomnography (PSG) analysis, outperforming baselines in tasks like demographic attribute prediction and sleep stage classification.

by Synced 2024-06-19 6

AI Machine Learning & Data Science Research

Google’s Proofread: AI-Driven Typing Accuracy in One Tap

In a new paper Proofread: Fixes All Errors with One Tap, a Google research team introduces Proofread, an innovative Gboard feature powered by a server-side LLM. This feature allows for seamless sentence and paragraph corrections with a single tap. Launched on Pixel 8 devices, it benefits thousands of users daily.

by Synced 2024-06-17 4

AI Machine Learning & Data Science Research

AI Pioneers Gather at BAAI 2024: Unveiling Innovations in Large-Scaled AI Models for Language, Multimodal, Embodied, Bio-Computing, and FlagOpen 2.0

“Global Vision, Ideas in Collision, Leading Cutting-Edge Innovations” – The 6th annual BAAI Conference successfully concluded on June 15. Over 200 AI scholars and industry leaders gathered to discuss the trajectories and applications of advanced AI technologies.

by Synced 2024-06-15 2

AI Machine Learning & Data Science Research

Stanford & CZ Biohub’s TEXTGRAD: Transforming AI Optimization with Textual Feedback

In a new paper TextGrad: Automatic ‘Differentiation’ via Text, a research team from Stanford University and CZ Biohub introduces TEXTGRAD, a robust framework that performs automatic differentiation through text. In this system, LLMs generate comprehensive, natural language suggestions to optimize variables in computation graphs.

by Synced 2024-06-11 58

AI Machine Learning & Data Science Research

Microsoft’s VALL-E 2: First Time Human Parity in Zero-Shot Text-to-Speech Achieved

In a new paper VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers, a Microsoft research team presents VALL-E 2, the latest advancement in neural codec language models. This innovation marks a milestone in zero-shot TTS synthesis by achieving human parity for the first time.

by Synced 2024-06-09 2

AI Machine Learning & Data Science Research

Matrix Multiplication-Free Language Models Maintain Top-Tier Performance at Billion-Parameter Scales

In a new paper Scalable MatMul-free Language Modeling, a research team introduces the first scalable MatMul-free language model, demonstrating that it is possible to completely eliminate MatMul operations from large language models (LLMs) while maintaining robust performance, even at billion-parameter scales.

by Synced 2024-06-05 2

AI Machine Learning & Data Science Research

From Text to Tunes: The Game-Changing Impact of Instruct-MusicGen on Music Production

A research team introduces Instruct-MusicGen, an innovative method that fine-tunes a pretrained MusicGen model to efficiently follow editing instructions, delivering superior performance across various tasks compared to existing baselines.

by Synced 2024-06-01 3

AI Machine Learning & Data Science Research

DeepMind’s Zipper: Fusing Unimodal Generative Models into Multimodal Powerhouses

In a new paper Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities, a Google DeepMind research team introduces Zipper, a multi-tower decoder architecture. This architecture can flexibly compose multimodal generative models from independently pre-trained unimodal decoders, which can be reused and repurposed in new multimodal combinations.

by Synced 2024-05-29 4

AI Machine Learning & Data Science Research

NVIDIA’s NV-Embed: Superior Performance in Embedding Tasks Without Proprietary Data

In a new paper NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models, an NVIDIA research team introduces NV-Embed. This generalist embedding model significantly boosts the performance of decoder-only LLMs in embedding and retrieval tasks while maintaining simplicity and reproducibility.

by Synced 2024-05-25 3

AI Machine Learning & Data Science Research

Unveiling the Secret Linearity of Transformers: Further Advance Model Efficiency and Performance

In a new paper Your Transformer is Secretly Linear, a research team uncovers a near-perfect linear relationship in transformations between sequential layers and introduces a novel distillation technique that approximates certain layers linearly while preserving model performance.

by Synced 2024-05-23 14

AI Machine Learning & Data Science Research

MedVersa: A Game-Changer Generalist Learner for Versatile Medical Image Interpretation

In a new paper A Generalist Learner for Multifaceted Medical Image Interpretation, a research team proposes MedVersa, a generalist AI model designed to enable flexible learning and tasking for medical image interpretation.

by Synced 2024-05-21 2

AI Machine Learning & Data Science Research

Generalizable Audio AI: Discover the Power of SpeechVerse by Amazon AWS AI Labs

In a new paper SpeechVerse: A Large-scale Generalizable Audio Language Model, a research team from Amazon AWS AI Labs introduces SpeechVerse, a robust multi-task framework that leverages supervised instruction fine-tuning to achieve strong performance across various speech tasks.

by Synced 2024-05-15 2

AI Machine Learning & Data Science Research

Meta’s Imagine Flash: Pioneering Ultra-Fast and High-Fidelity Image Generation Within 3 Steps

In a new paper Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation, a Meta GenAI research team introduces an innovative distillation framework aimed at enabling high-fidelity, diverse sample generation within just one to three steps. This framework surpasses existing competitors in both quantitative metrics and human evaluations.

by Synced 2024-05-13 6

AI Machine Learning & Data Science Research

IBM’s Granite Code: Powering Enterprise Software Development with AI Precision

An IBM research team introduces the Granite Code model family. Specifically optimized for enterprise software development workflows, these models excel across a spectrum of coding tasks, rendering them versatile and well-suited for diverse coding challenges.

by Synced 2024-05-08 7

AI Machine Learning & Data Science Research

Unveiling Google’s Med-Gemini: Revolutionizing Medical AI with Cutting-Edge Capabilities

A research team from Google and Verily introduces Med-Gemini, a family of highly proficient multimodal models tailored for medical tasks, with the capacity to seamlessly integrate web search functionality and adapt efficiently to new modalities through customized encoders.