ML | Synced

by Synced 2024-12-31 58

Automating Artificial Life Discovery: The Power of Foundation Models

A research team introduces Automated Search for Artificial Life (ASAL). This novel framework leverages vision-language FMs to automate and enhance the discovery process in ALife research.

by Synced 2024-12-28 36

AI Machine Learning & Data Science Research

Llama 3 Meets MoE: Pioneering Low-Cost High-Performance AI

Researchers from the University of Texas at Austin and NVIDIA propose an upcycling approach, an innovative training recipe that enables the development of an 8-Expert Top-2 MoE model using Llama 3-8B with less than 1% of the compute typically required for pre-training.

by Synced 2024-12-26 26

AI Machine Learning & Data Science Research

DeepMind’s JetFormer: Unified Multimodal Models Without Modelling Constraints

A DeepMind research team introduces JetFormer, a Transformer designed to directly model raw data. This model maximizes the likelihood of raw data without depending on any pre-trained components, and is capable of both understanding and generating text and images seamlessly.

by Synced 2024-12-23 27

AI Machine Learning & Data Science Research

NVIDIA’s nGPT: Revolutionizing Transformers with Hypersphere Representation

An NVIDIA research team proposes the normalized Transformer, which consolidates key findings in Transformer research under a unified framework, offering faster learning and reduced training steps—by factors ranging from 4 to 20 depending on sequence length.

by Synced 2024-12-17 21

AI Machine Learning & Data Science Research

From Token to Conceptual: Meta introduces Large Concept Models in Multilingual AI

A research team at Meta introduces the Large Concept Model (LCM), a novel architecture that processes input at a higher semantic level. This shift allows the LCM to achieve remarkable zero-shot generalization across languages, outperforming existing LLMs of comparable size.

by Synced 2024-12-14 54

AI Machine Learning & Data Science Research

NVIDIA’s Hybrid: Combining Attention and State Space Models for Breakthrough Performance of Small Language Models

An NVIDIA research team proposes Hymba, a family of small language models that blends transformer attention with state space models. Hymba outperforms the Llama-3.2-3B model with 1.32% higher average accuracy, while reducing cache size by 11.67× and increasing throughput by 3.49×.

by Synced 2024-12-12 25

AI Machine Learning & Data Science Research

From Response to Query: The Power of Reverse Thinking in Language Models

In a new paper Time-Reversal Provides Unsupervised Feedback to LLMs, a research team from Google DeepMind and the Indian Institute of Science proposes Time Reversed Language Models (TRLMs), a framework that allows LLMs to reason in reverse, scoring and generating content in a manner opposite to the traditional forward approach.

by Synced 2024-12-09 48

AI Machine Learning & Data Science Research

Yann LeCun Team’s New Research: Revolutionizing Visual Navigation with Navigation World Models

In a new paper Navigation World Models, a research team from Meta, New York University and Berkeley AI Research proposes a Navigation World Model (NWM), a controllable video generation model that enables agents to simulate potential navigation plans and assess their feasibility before taking action.

by Synced 2024-12-07 42

AI Machine Learning & Data Science Research

The Future of Vision AI: How Apple’s AIMV2 Leverages Images and Text to Lead the Pack

An Apple research team introduces AIMV2, a family of vision encoders that is designed to predict both image patches and text tokens within a unified sequence. This combined objective enables the model to excel in a range of tasks, such as image recognition, visual grounding, and multimodal understanding.

by Synced 2024-12-05 24

AI Machine Learning & Data Science Research

Redefining Music AI: The Power of Sony’s SoniDo as a Versatile Foundation Model

In a new paper Music Foundation Model as Generic Booster for Music Downstream Tasks, a Sony research team presents SoniDo, a groundbreaking music foundation model that offers a robust framework for improving the effectiveness and accessibility of music processing.

by Synced 2024-11-29 24

AI Machine Learning & Data Science Research

DeepMind’s Socratic Learning with Language Games: The Path to Self-Improving Superintelligence

Researchers from Google DeepMind introduce the concept of “Socratic learning.” This refers to a form of recursive self-improvement in artificial intelligence that significantly enhances performance beyond the initial data or knowledge available to the system, as well as a practical framework to implement it.

by Synced 2024-11-28 16

AI Machine Learning & Data Science Research

Revolutionizing AI on a Budget: Apple’s Roadmap for Small Language Models Training Success

Apple researchers conducted a systematic study of the computational bottlenecks and cost-efficiency of training SLMs. Their work evaluates training strategies across diverse cloud infrastructure setups, offering practical insights for improving efficiency and reducing costs.

by Synced 2024-11-26 8

AI Machine Learning & Data Science Research

Redefines Consistency Models: OpenAI’s TrigFlow Narrows FID Gap to 10% with Efficient Two-Step Sampling

OpenAI researchers introduce TrigFlow, a simplified theoretical framework that identifies the key causes of training instability in consistency models and addresses them with novel improvements in diffusion process parameterization, network architecture, and training objectives.

by Synced 2024-11-25 7

AI Machine Learning & Data Science Research

Precision in Pixels: NVIDIA’s Edify Image Model Combines High Quality with Unmatched Control

In a new paper Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models, an NVIDIA research team introduces Edify Image—a suite of pixel-based diffusion models that achieve high-resolution image synthesis with exceptional control and precision.

by Synced 2024-11-19 10

AI Machine Learning & Data Science Research

Meta’s Dualformer: Bridging Fast and Slow Thinking in Transformers for Superior AI Reasoning

In a new paper Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces, a Meta research team presents Dualformer, a single Transformer model that merges both fast and slow reasoning modes within a unified framework.

by Synced 2024-11-17 6

AI Machine Learning & Data Science Research

NVIDIA’s OMCAT: A Breakthrough in Cross-Modal Temporal Understanding for Multimodal AI

An NVIDIA research team introduces OMCAT: Omni Context Aware Transformer in their new paper, presenting both OCTAV, a unique dataset aimed at capturing event transitions across audio and video, and OMCAT, a model that employs RoTE (Rotary Time Embeddings).

by Synced 2024-11-15 15

AI Machine Learning & Data Science Research

Stanford U’s Tutor CoPilot Transforms Real-Time Tutoring with AI-Driven Expert Guidance

A Stanford University research team presents Tutor CoPilot, a new model that offers expert-level guidance to tutors in real time. This study is the first of its kind—a randomized controlled trial testing a Human-AI system in live tutoring scenarios.

by Synced 2024-11-12 9

AI Machine Learning & Data Science Research

Bridging the Gap: Induction-Head Ngram Models for Efficient, Interpretable Language Modeling

A research team introduces a novel approach called Induction-head ngram models (Induction-Gram). This technique merges the interpretability and efficiency of n-gram models with insights from neural LLMs to enhance language modeling performance.

by Synced 2024-11-07 6

AI Machine Learning & Data Science Research

Self-Evolving Prompts: Redefining AI Alignment with DeepMind & Chicago U’s eva Framework

A research team from DeepMind and the University of Chicago presents a novel approach to Reinforcement Learning from Human Feedback. The proposed eva introduces a flexible, scalable framework that leverages any RLHF algorithm to drive more effective alignment with human values.

by Synced 2024-11-05 5

AI Machine Learning & Data Science Research

Unlocking Turing Completeness: How Large Language Models Achieve Universal Computation Without Assistance

A research team from Google DeepMind and the University of Alberta presents evidence that transformer-based LLMs using autoregressive decoding can indeed support universal computation without any external adjustments or modifications to model weights.

by Synced 2024-10-30 5

AI Machine Learning & Data Science Research

From OCR to Multi-Image Insight: Apple’s MM1.5 with Enhanced Text-Rich Image Understanding and Visual Reasoning

Building on MM1’s success, Apple’s new paper, MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning, introduces an improved model family aimed at enhancing capabilities in text-rich image understanding, visual grounding, and multi-image reasoning.

by Synced 2024-10-28 5

AI Machine Learning & Data Science Research

AI Self-Evolution: How Long-Term Memory Drives the Next Era of Intelligent Models

A research team investigates AI self-evolution. Their work examines how models enhanced with Long-Term Memory (LTM) can adapt and evolve through interaction with their environments, a key step toward achieving more dynamic AI.

by Synced 2024-10-25 4

AI Machine Learning & Data Science Research

Breaking Barriers in Cellular Automata with CAX: Faster, Scalable, and Open for All

In a new paper CAX: Cellular Automata Accelerated in JAX, a research team introduces CAX, a powerful open-source library designed to enhance CA research. It enables rapid CA simulations through extensive parallelization on various hardware, including CPUs, GPUs, and TPUs.

by Synced 2024-10-23 6

AI Machine Learning & Data Science Research

LLMs as Code Architects: Meta’s New Approach to Precise Code Transformations

In a new paper Don’t Transform the Code, Code the Transforms: Towards Precise Code Rewriting using LLMs, a Meta research team proposes a novel chain-of-thought strategy to efficiently generate code transformations using LLMs. Their approach enables LLMs to derive transformations based on a small set of input/output examples.

by Synced 2024-10-21 4

AI Machine Learning & Data Science Research

Thinking Fast and Slow: Google DeepMind’s Dual-Agent Architecture for Smarter AI

A Google DeepMind research team proposes a biologically-inspired dual-system framework for intelligent agents. This “Talker-Reasoner” architecture aligns with Kahneman’s concept, where System 1 is fast and intuitive, while System 2 is slower and deliberative.

by Synced 2024-10-16 5

AI Machine Learning & Data Science Research

From Dense to Dynamic: NVIDIA’s Innovations in Upcycling LLMs to Sparse MoE

In a new paper Upcycling Large Language Models into Mixture of Experts, an NVIDIA research team introduces a new “virtual group” initialization technique to facilitate the transition of dense models into fine-grained MoE structures.

by Synced 2024-10-12 5

AI Machine Learning & Data Science Research

Web Data to Real-World Action: Enabling Robots to Master Unseen Tasks

A research team presents a novel language-conditioned robot manipulation framework called Gen2Act, which achieves generalization to unseen tasks using publicly available web data, eliminating the need to collect specific robot data for every task.

by Synced 2024-10-09 3

AI Machine Learning & Data Science Research

Scaling Multi-Objective Optimization: Meta & FAIR’s CGPO Advances General-purpose LLMs

In a new paper The Perfect Blend: Redefining RLHF with Mixture of Judges, a research team from Meta GenAI and FAIR developed Constrained Generative Policy Optimization (CGPO), which offers a more structured approach to RLHF, advancing the performance of general-purpose LLMs.

by Synced 2024-10-07 3

AI Machine Learning & Data Science Research

Instant 3D Vision: Apple’s Depth Pro Delivers High-Precision Depth Maps in 0.3 Seconds

Apple introduces Depth Pro, a state-of-the-art foundation model designed for zero-shot metric monocular depth estimation. This model can generate high-resolution depth maps with exceptional clarity and fine detail, producing a 2.25-megapixel depth map in just 0.3 seconds on a standard GPU.

by Synced 2024-10-03 8

AI Machine Learning & Data Science Research

Law of the Weakest Link: Advancing Large Language Models Through Cross-Capability

A joint research team from Meta and the University of Illinois Urbana-Champaign introduces CrossEval, a benchmark designed to assess both individual and cross capabilities. Their findings demonstrate that LLMs often adhere to the “Law of the Weakest Link”—where performance on complex tasks is limited by the weakest capability.

by Synced 2024-09-30 17

AI Machine Learning & Data Science Research

Google’s Zero-Shot Cross-Lingual Voice Transfer for Dysarthric Speakers

In a new paper Zero-shot Cross-lingual Voice Transfer for TTS, a Google research team presents a new VT module that seamlessly integrates into a multilingual TTS system, enabling voice transfer across languages.

by Synced 2024-09-28 12

AI Machine Learning & Data Science Research

Practical Lossless Text Compression: FineZip Delivers 54x Speed Boost via Large Language Models

In a new paper FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression, a research team from UC Berkeley and NYU introduces FineZip, a novel LLM-based compression system designed to significantly reduce compression time.

by Synced 2024-09-23 8

AI Machine Learning & Data Science Research

Microsoft’s MarS: A Game-Changer in Financial Market Simulations Powered by Generative AI

A Microsoft Research Asia research team introduces MarS, a financial market simulation engine powered by a Large Market Model, which addresses the unique demands of modeling the market impact of orders while enabling highly realistic, controllable simulations.

by Synced 2024-09-20 10

AI Machine Learning & Data Science Research

MIT’s SciAgents: Automating Scientific Discovery with AI-Powered Graph Reasoning

A research team presents SciAgents, which aims to automate the process of scientific discovery by revealing hidden interdisciplinary relationships that traditional research methods often overlook. SciAgents operates at a scale, precision, and exploratory power that far surpass human-driven approaches.

by Synced 2024-09-17 4

AI Machine Learning & Data Science Research

Stanford’s Landmark Study: AI-Generated Ideas Rated More Novel Than Expert Concepts

A Stanford University research team introduces an experimental framework aimed at evaluating LLMs’ ability to generate research ideas. This study, the first of its kind, compares the ideation capabilities of over 100 expert NLP researchers against an LLM-based ideation system.

by Synced 2024-09-13 7

AI Machine Learning & Data Science Research

Revolutionizing Autonomous Agents: Salesforce’s xLAM Outperforms GPT-4

A Salesforce AI Research team presents the xLAM series, a collection of large action models designed to enhance the performance of open-source LLMs for autonomous AI agents. This work aims to accelerate innovation in the field and make high-performance models for agent tasks more accessible.

by Synced 2024-09-11 16

AI Machine Learning & Data Science Research

Outperforming Giants: TinyAgent’s Edge-Based Solution Surpasses GPT-4-Turbo

A research team introduces TinyAgent, a framework designed to train and deploy small, task-specific language models capable of performing function calls for agentic systems at the edge, which outperforms larger models such as GPT-4-Turbo in this specific function-calling ability.

by Synced 2024-09-09 889

AI Machine Learning & Data Science Research

Microsoft’s Fully Pipelined Distributed Transformer Processes 16x Sequence Length with Extreme Hardware Efficiency

A Microsoft research team introduces the Fully Pipelined Distributed Transformer, which leverages the multiple memory hierarchies available in modern GPU clusters, enhancing hardware efficiency and cost-effectiveness while achieving exceptionally high Model FLOPs Utilization (MFU).

by Synced 2024-09-06 25

AI Machine Learning & Data Science Research

Google’s GameNGen: Bringing Real-Time Game Simulation to Life with Neural Models

In a new paper Diffusion Models Are Real-Time Game Engines, a Google research team presents GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with complex environments over extended sequences, maintaining high-quality output.

by Synced 2024-09-04 7

AI Machine Learning & Data Science Research

Samsung’s MobileQuant: Bringing High-Performance Language Models to Your Pocket

A research team from Samsung makes a first attempt to facilitate LLM deployment on edge devices using integer-only quantization. The proposed MobileQuant is a post-training quantization technique that reduces both inference latency and energy consumption while preserving accuracy comparable to that achieved with 16-bit activations.