Tag: code generation

AI Machine Learning & Data Science Nature Language Tech Research

ServiceNow Research & Hugging Face Release The Stack: 3 TB of Permissively Licensed Source Code for LLMs

In the new paper The Stack: 3 TB of Permissively Licensed Source Code, a team from ServiceNow Research and Hugging Face advances open and responsible research on code LLMs by releasing The Stack, a 3.1 TB dataset of permissively licensed source code in 30 programming languages.

AI Machine Learning & Data Science Research

Microsoft, Penn U & UC San Diego’s TiCoder Framework Generates Code With 90.4% Consistency to User Intent

In the new paper Interactive Code Generation via Test-Driven User-Intent Formalization, a team from Microsoft Research, the University of Pennsylvania, and the University of California, San Diego proposes a workflow for test-driven user-intent formalization that leverages user feedback to generate code that is 90.40 percent consistent with user intent.

AI Machine Learning & Data Science Research

Salesforce’s CodeRL Achieves SOTA Code Generation Results With Strong Zero-Shot Transfer Capabilities

In the new paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning, a Salesforce Research team presents CodeRL, a novel framework for program synthesis tasks that employs pretrained language models (LMs) and deep reinforcement learning (RL) and achieves state-of-the-art performance on the challenging APPS benchmark while also demonstrating impressive zero-shot transfer capabilities.