Large language models (LLMs) have gained popularity in widespread real-world applications. Their impressive natural language processing capabilities allow them to be prompted to handle a wide range of tasks. Moreover, recent studies have found that LLMs’ efficacy can be boosted significantly by training on large, in-domain datasets; prominent use cases include code generation and infilling.
In the new paper Code Llama: Open Foundation Models for Code, a Meta AI research team releases Code Llama, a family of code-specialized Llama 2 models for code generation and infilling that achieves state-of-the-art performance among open models on code benchmarks.
The family comprises three main variants, each available in three sizes (7B, 13B and 34B parameters):
- Code Llama: a foundational model for code generation tasks.
- Code Llama – Python: a version specialized for Python.
- Code Llama – Instruct: a version fine-tuned with human instructions and self-instruct code synthesis data.
In general, the proposed approach applies a cascade of training and fine-tuning steps that gradually specialize Llama 2 and increase its capabilities. Specifically, the team trains Code Llama on 500B tokens, starting from the 7B, 13B and 34B versions of Llama 2. To train the infilling models, they use a causal masking technique, in which part of a training sequence is moved to the end and predicted from the surrounding context, to train the general-purpose 7B and 13B models with an infilling objective.
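The infilling objective can be illustrated with a minimal sketch of how a training document might be rearranged. The sentinel strings, the uniform choice of cut points and the helper names `to_infilling_example` and `recover_document` are assumptions made for illustration, not details taken from the article.

```python
import random

# Hypothetical sentinel strings (assumption); a real tokenizer would
# reserve dedicated token IDs for these markers.
PRE, SUF, MID, EOT = "<PRE>", "<SUF>", "<MID>", "<EOT>"

def to_infilling_example(document: str, rng: random.Random) -> str:
    """Split a document into prefix/middle/suffix and move the middle to the end."""
    # Two cut points chosen uniformly at random (an assumption for this sketch).
    i, j = sorted(rng.randint(0, len(document)) for _ in range(2))
    prefix, middle, suffix = document[:i], document[i:j], document[j:]
    # The model reads the prefix and suffix first, then predicts the middle
    # left to right with the ordinary next-token objective.
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}{EOT}"

def recover_document(example: str) -> str:
    """Invert the rearrangement to check that no text was lost."""
    body = example[len(PRE):-len(EOT)]
    prefix, rest = body.split(SUF, 1)
    suffix, middle = rest.split(MID, 1)
    return prefix + middle + suffix
```

Because the rearranged sequence is still trained with next-token prediction, the same model can serve both ordinary left-to-right generation and infilling.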
The researchers use AdamW as the optimizer and retain the original learning rate of the Llama 2 base model for the 13B and 34B models, which they observed to give the best results. Furthermore, they introduce a dedicated long context fine-tuning (LCFT) stage so that the models can handle long sequences effectively.
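For readers unfamiliar with AdamW, the sketch below shows a single update for one scalar parameter. The function name `adamw_step` and all default hyperparameter values are illustrative assumptions, not settings reported in the article.

```python
import math

def adamw_step(param, grad, m, v, step, lr=3e-4, beta1=0.9, beta2=0.95,
               eps=1e-8, weight_decay=0.1):
    """One AdamW update for a scalar parameter; returns (param, m, v).

    All default values are placeholders chosen for illustration (assumption).
    `step` counts updates starting from 1.
    """
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialized averages.
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)
    # Decoupled weight decay: applied to the parameter directly,
    # not folded into the gradient.
    param = param - lr * weight_decay * param
    # Adaptive gradient step.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v
```

The decoupled decay term is what distinguishes AdamW from Adam with L2 regularization added to the loss.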
In their empirical study, the team evaluated Code Llama on a variety of benchmarks, including HumanEval (Chen et al., 2021), MBPP (Austin et al., 2021), APPS (Hendrycks et al., 2021), MultiPL-E (Cassano et al., 2023) and GSM8K (Cobbe et al., 2021).
Code Llama achieves state-of-the-art results among open models, with scores of up to 53% and 55% on HumanEval and MBPP respectively, and surpasses all other publicly available models on MultiPL-E.
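The article does not say which metric these scores use. Results on HumanEval and MBPP are commonly reported as pass@k, the probability that at least one of k sampled programs passes the unit tests. Assuming that metric, a minimal sketch of the standard unbiased estimator looks like this (the function name `pass_at_k` is chosen here for illustration):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n generated samples, of which c passed the tests.

    Computes 1 - C(n - c, k) / C(n, k): one minus the probability that a
    random subset of k samples contains no passing program.
    """
    if n - c < k:
        # Fewer than k failing samples, so every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For k = 1 the estimate reduces to the fraction of passing samples, c / n.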
The researchers have released Code Llama for both research and commercial use. They hope future work will further improve LLMs’ ability to understand context and nuance in instructions.
The paper Code Llama: Open Foundation Models for Code is on arXiv.
Author: Hecate He | Editor: Chain Zhang