
IBM’s Granite Code: Powering Enterprise Software Development with AI Precision

In recent years, there has been remarkable advancement in Large Language Models (LLMs) capable of generating and manipulating code. A variety of models exhibiting impressive coding capabilities have emerged. Nevertheless, significant gaps persist within the realm of LLMs tailored for code, particularly concerning enterprise software development.

In a new paper Granite Code Models: A Family of Open Foundation Models for Code Intelligence, an IBM research team introduces the Granite Code model family. Specifically optimized for enterprise software development workflows, these models excel across a spectrum of coding tasks, rendering them versatile and well-suited for diverse coding challenges.

The Granite Code family comprises decoder-only models built for code-generative tasks, offered in two primary variants, each at four sizes (3B, 8B, 20B, and 34B):

The base models are trained with a two-phase strategy. In phase 1, the model trains on 3 to 4 trillion tokens spanning 116 programming languages, building a nuanced grasp of language syntax and structure. In phase 2, it continues training on 500 billion tokens drawn from carefully curated datasets covering both code and natural language.
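The two-phase recipe can be pictured as a data-mixing schedule: code-only in phase 1, a code/natural-language blend in phase 2. The sketch below is purely illustrative; the 80/20 split and source names are assumptions, not figures from the paper.

```python
import random

def sample_training_source(phase, rng, code_weight=0.8):
    """Pick the data source for the next training example.

    Phase 1 draws exclusively from code; phase 2 mixes code with
    natural-language data. The 80/20 split here is an illustrative
    assumption, not a figure from the Granite Code paper.
    """
    if phase == 1:
        return "code"  # phase 1: code-only pretraining (3-4T tokens)
    # phase 2: curated mix of code and natural language (500B tokens)
    return "code" if rng.random() < code_weight else "natural_language"

rng = random.Random(0)
phase2 = [sample_training_source(2, rng) for _ in range(10_000)]
print(phase2.count("code") / len(phase2))  # close to the 0.8 mixing weight
```

In a real pretraining pipeline this weighting would typically be applied at the dataset-sampling level rather than per example, but the principle is the same: the phase determines which corpora are eligible and in what proportion.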

The instruct models are derived from these base models through additional fine-tuning. The fine-tuning data combines a refined version of CommitPack with natural language instruction-following datasets (such as OASST and HelpSteer) and open-source mathematical datasets (such as MathInstruct and MetaMathQA). Synthetically generated code datasets play a pivotal role in strengthening instruction-following and reasoning abilities.
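To make the fine-tuning stage concrete, here is a rough illustration of what a single instruction-following training record from such a mix might look like. The schema and field names are hypothetical, chosen for illustration; they are not the actual format used for CommitPack or the Granite instruct models.

```python
import json

def make_instruct_record(source, instruction, response):
    """Build one instruction-following training record.

    The source/instruction/response schema is a hypothetical
    illustration of instruction-tuning data, not the format
    actually used to train the Granite Code instruct models.
    """
    return {
        "source": source,                 # e.g. a commit dataset or math corpus
        "instruction": instruction.strip(),
        "response": response.strip(),
    }

record = make_instruct_record(
    "synthetic-code",
    "Write a Python function that reverses a string.",
    "def reverse(s):\n    return s[::-1]",
)
print(json.dumps(record, indent=2))
```

Records like this, drawn from several corpora (commit data, instruction-following dialogue, math problems, synthetic code tasks), are what the fine-tuning stage consumes to teach the base model to follow instructions.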

In their empirical investigation, the team conducts extensive evaluations of their code LLMs across a comprehensive array of benchmarks. Results showcase the Granite Code models’ robust performance across all model sizes and benchmarks, often surpassing other open-source code models, even those twice their size.

In summary, the key strengths of the Granite Code models are consistent performance across model sizes and benchmarks, versatility across a wide range of coding tasks, and a design optimized for enterprise software development workflows.

Looking ahead, the team is committed to continually enhancing these models’ performance. Future plans include the release of long-context variants, as well as specialized models tailored for Python and Java.

The code is available on the project’s GitHub. The paper Granite Code Models: A Family of Open Foundation Models for Code Intelligence is on arXiv.


Author: Hecate He | Editor: Chain Zhang


