large language model | Synced

by Synced 2023-09-06 3

Unlocking Limitless Retrieval Power: Google’s MEMORY-VQ Revolutionizes LLMs with Remarkable Compression

In a new paper MEMORY-VQ: Compression for Tractable Internet-Scale Memory, a Google research team introduces MEMORY-VQ, a novel method that significantly reduce storage requirements for memory-based methods while maintaining high performance, achieving 16x compression rate on the KILT benchmark.

by Synced 2023-09-05 5

AI Machine Learning & Data Science Research

MIT’s AskIt Provides A Unified Programming Interface for Code Generation with LLMs

In a new paper AskIt: Unified Programming Interface for Programming with Large Language Models, a MIT CSAIL research team presents AskIt, a domain-specific language (DSL) tailored for LLMs to accommodate a wide variety of tasks, which substantially reducing practitioners’ developmental overhead and effort for software.

by Synced 2023-08-31 5

AI Machine Learning & Data Science Research

CMU & Tsinghua U’s Prompt2Model Generates Deployable Models Following Natural Language Instructions

In a new paper Prompt2Model: Generating Deployable Models from Natural Language Instructions, a research team from Carnegie Mellon University and Tsinghua University introduces Prompt2Model, a general-purpose approach that is able to use prompting technique to specify system behavior while resulting in a deployable special purpose model that enjoys all the advantages thereof.

by Synced 2023-08-29 7

AI Machine Learning & Data Science Nature Language Tech Research

Meta AI Open Sources Code Llama: A SOTA Code-Specialized Llama 2

In a new paper Code Llama: Open Foundation Models for Code, a Meta AI research team releases Code Llama, a family of code-specialized Llama 2 models for code generation and infilling, which achieves state-of-the-art performance against open models on code benchmarks.

by Synced 2023-08-21 5

AI Machine Learning & Data Science Research

Boston U’s Platpus Provides Quick, Cheap, and Powerful Refinement of LLMs, Achieving Top 1 in Open LLM Leaderboard

In a new paper Platypus: Quick, Cheap, and Powerful Refinement of LLMs, a Boston University research team presents Platpus, a family of fine-tuned and merged Large Language Models (LLMs) that achieves the first place in HuggingFace’s Open LLM Leaderboard by performing quick, cheap and powerful refinement of conventional LLMs.

by Synced 2023-08-06 11

AI Machine Learning & Data Science Research

New Study Unleashes The Power of Large Language Models to Master 16000+ Real World APIs

In a new paper ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs, a research team from Tsinghua University, ModelBest Inc., Renmin University of China, Yale University, Tencent Inc. and Zhihu Inc. presents ToolLLM, a general tool-use framework that demonstrates a compelling capability to master 16464 real-world RESTful APIs

by Synced 2023-08-02 4

AI Machine Learning & Data Science Research

3D-LLM: Integrate 3D World Into Language Models

In a new paper 3D-LLM: Injecting the 3D World into Large Language Models, a research team inject the 3D world into large language models and presents 3D-LLMs, a whole new family of models that can capture 3D spatial information to perform 3D-related tasks.

by Synced 2023-07-19 12

AI Machine Learning & Data Science Popular Research

Meta AI’s Llama 2: Open-Sourced LLM with Commercial Rights Reshapes Industry

In a new paper Llama 2: Open Foundation and Fine-Tuned Chat Model, a Meta AI research team presents and releases Llama 2 and Llama 2-Chat, the former one is a family of pretrained and fine-tuned LLMs and the later one is a fine-tuned version of Llama 2 that is optimized for dialogue, paving the way to develop more responsible LLMs.

by Synced 2023-07-18 1

AI Machine Learning & Data Science Research

65-Billion-Parameter Large Model Pretraining Accelerated by 38%, Best Practices for Building LLaMA-like Base Models Open-Source

Colossal-AI—the world’s largest and most active big model development tool and community—utilizes the current most widely used large model, LLaMA, to provide an example of the tool’s groundbreaking pre-training solutions for the 65 billion parameter large model which improves the training speed by 38%.

by Synced 2023-07-09 8

AI Machine Learning & Data Science Research

DeepMind Collaborates on Shaping Personality Traits in LLMs

In a new paper Personality Traits in Large Language Models, a research team from Google, Cambridge University and Keio University proposes principled, validated methods to construct validity of characterizing personalities in LLM, simulates population variance in LLM responses and develops a personality shaping mechanism to control LLM personality traits.

by Synced 2023-06-30 2

AI Machine Learning & Data Science Research

DeepMind’s Proposes New Paradigm for Interfacing Language Model with Robots Through Rewards

In a new paper Language to Rewards for Robotic Skill Synthesis, a Google DeepMind research team proposes a new paradigm to leverage reward functions to interface language and low-level robot actions, which enables non-technical users to steer novel and intricate robot actions without large amount of data or expert knowledge to engineer low-level primitives.

by Synced 2023-06-16 8

AI Machine Learning & Data Science Research

Unlock Open Finance: Columbia U & NYU Open-Source FinGPT to Democratize Financial LLMs

In a new paper FinGPT: Open-Source Financial Large Language Models, a research team from Columbia University and New York University (Shanghai) presents FinGPT, an end-to-end open-source financial large language models (FinLLMs) that democratize financial data to encourage researchers and practitioners to developer user-specified FinLLMs.

by Synced 2023-06-13 5

AI Machine Learning & Data Science Research

Salesforce AI’s CodeTF Library Facilitates Easy LLM Integration for Code Intelligence Tasks

In a new paper CodeTF: One-stop Transformer Library for State-of-the-art Code LLM, a Salesforce AI research team develop CodeTF, an open-source one-stop comprehensive Python library that provides a seamless interface for training and inferencing on code intelligence tasks, aiming to facilitate easy integration of state-of-the-art language models into real-world applications.

by Synced 2023-06-08 5

AI Machine Learning & Data Science Research

Meta AI’s Novel Setup Reveals The Structure and Evolution of Transformers

In a new paper Birth of a Transformer: A Memory Viewpoint, a Meta AI research team introduces a new synthetic setup to explore the structure and evolution of transformer language models, aiming to provide insights of the global vs in-context learning of LLMs.

by Synced 2023-06-01 2

AI Machine Learning & Data Science Research

Google & Stanford U’s DoReMi Significantly Speeds Up Language Model Pretraining

In the new paper DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining, a research team from Google and Stanford University introduces Domain Reweighting with Minimax Optimization (DoReMi), a domain weight optimization strategy that leverages distributionally robust optimization (DRO) to substantially speed up effective language model pretraining.

by Synced 2023-05-29 2

AI Machine Learning & Data Science Nature Language Tech Research

Meta AI’s READ Method for Fine-Tuning Large Transformers Cuts GPU Energy Costs by 84%

In the new paper READ: Recurrent Adaptation of Large Transformers, a Meta AI research team proposes REcurrent ADaption (READ), a lightweight and memory-efficient fine-tuning approach that achieves a 56 percent reduction in memory consumption and an 84 percent reduction in GPU use.

by Synced 2023-05-24 1

AI Machine Learning & Data Science Research

Alibaba & HUST’s ONE-PEACE: Toward a General Representation Model For Unlimited Modalities

In the new paper ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities, a research team from Alibaba Group’s DAMO Academy and the Huazhong University of Science and Technology releases ONE-PEACE, a highly extensible model that can align and integrate representations across vision, audio, and language modalities; opening a path toward the creation of a general representation model for unlimited modalities.

by Synced 2023-05-23 3

AI Machine Learning & Data Science Research

Google’s PaLM 2 Technical Report Details the New Model Family’s Research Advances

In the paper PaLM 2 Technical Report, Google details the modelling advances, data improvements and scaling insights that have enabled its new PaLM2 large language model to achieve state-of-the-art performance.

by Synced 2023-05-17 1

AI Machine Learning & Data Science Research

Salesforce AI’s CodeT5+ Open Code LLMs Flexibly Adapt to Diverse Downstream Code Understanding and Generation Tasks

In the new paper CodeT5+: Open Code Large Language Models for Code Understanding and Generation, a Salesforce AI Research team presents CodeT5+, a novel family of encoder-decoder code foundation large language models that can be flexibly adapted to a wide range of code understanding and generation tasks and outperform various code-related benchmarks.

by Synced 2023-05-16 2

AI Machine Learning & Data Science Nature Language Tech Research

‘May the Source Be With You!’ – BigCode’s Open-Access StarCoder Outperforms All Existing Open Code LLMs

In the new paper StarCoder: May the Source Be With You!, the BigCode community releases StarCoder and StarCoderBase, 15.5B parameter open-access large language models (LLMs) trained on 80+ programming languages. StarCoderBase outperforms all multi-programming-language code LLMs, and StarCoder surpasses all models fine-tuned on Python.

by Synced 2023-05-15 4

AI Machine Learning & Data Science Research

Meet VideoChat: Integrating Language and Video Models to Boost Video Understanding

In the new paper VideoChat: Chat-Centric Video Understanding, a research team from Shanghai AI Laboratory, Nanjing University, the University of Hong Kong, and the Chinese Academy of Sciences presents VideoChat, a groundbreaking end-to-end chat-centric video understanding system that leverages state-of-the-art video and language models to improve spatiotemporal reasoning, event localization, and causal relationship inference.

by Synced 2023-05-09 4

AI Machine Learning & Data Science Research

Microsoft’s Automatic Prompt Optimization Improves Prompts to Boost LLM Performance

In the new paper Automatic Prompt Optimization with “Gradient Descent” and Beam Search, a Microsoft research team presents Automatic Prompt Optimization, a simple and general prompt optimization algorithm that automatically improves prompts for large language models, significantly reducing the time and energy spent on manual prompting approaches.

by Synced 2023-05-03 4

AI Machine Learning & Data Science Research

Optimizing Transformers: Microsoft & RUC’s ResiDual Solves Gradient Vanishing and Representation Collapse Issues

In the new paper ResiDual: Transformer With Dual Residual Connections, a team from Microsoft Research, Microsoft Azure Translation, and Renmin University of China proposes ResiDual, a novel transformer architecture that fuses the connections in post-layer normalization and pre-layer normalization to exploit the benefits of both while also addressing their limitations.

by Synced 2023-04-13 3

AI Machine Learning & Data Science Nature Language Tech Research

Microsoft’s LLMA Accelerates LLM Generations via an ‘Inference-With-Reference’ Decoding Approach

In the new paper Inference with Reference: Lossless Acceleration of Large Language Models, a Microsoft research team proposes LLMA, an inference-with-reference decoding mechanism that achieves up to 2x lossless speed-ups with identical generation results by exploiting the overlaps between LLM outputs and references.

by Synced 2023-04-04 21

AI Machine Learning & Data Science Research

Bloomberg & JHU’s BloombergGPT: ‘A Best-in-Class LLM for Financial NLP’

In the new paper BloombergGPT: A Large Language Model for Finance, a research team from Bloomberg and Johns Hopkins University presents BloombergGPT, a 50 billion parameter language model trained on a 700 billion token dataset that significantly outperforms current benchmark models on financial tasks.

by Synced 2023-03-27 3

AI Machine Learning & Data Science Research

Microsoft Does a Deep Dive on GPT-4, Finds “Sparks of AGI”

In the new paper Sparks of Artificial General Intelligence: Early Experiments with GPT-4, a Microsoft Research team investigates GPT-4, demonstrating its ability to achieve human-level performance on novel and difficult tasks in domains ranging from mathematics and coding to vision, medicine, law and psychology; and proposing it as an early version of an AGI system.

by Synced 2023-03-23 6

AI Machine Learning & Data Science Nature Language Tech Research

OpenAI, Open Research & UPenn Paper Considers How GPTs Will Impact the US Labour Market

In the new paper GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models, a research team from OpenAI, OpenResearch, and the University of Pennsylvania investigates the potential impact of LLMs like GPT on the US labour market, shedding light on the economic, social, and policy implications.

by Synced 2023-03-22 10

AI Machine Learning & Data Science Nature Language Tech Research

Microsoft’s UPRISE Automatically Retrieves Prompts to Boost the Zero-Shot Performance of Large Language Models

In the new paper UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation, a Microsoft research team introduces a novel approach that tunes a lightweight and versatile retriever to retrieve prompts for any given task input to improve the zero-shot performance of LLMs.