AI Machine Learning & Data Science Research

IBM’s Granite Code: Powering Enterprise Software Development with AI Precision

An IBM research team introduces the Granite Code model family. Optimized for enterprise software development workflows, these models perform strongly across a wide spectrum of coding tasks, making them versatile tools for diverse coding challenges.

Recent years have seen remarkable advances in Large Language Models (LLMs) capable of generating and manipulating code, and a variety of models with impressive coding capabilities have emerged. Nevertheless, significant gaps remain among LLMs tailored for code, particularly when it comes to enterprise software development.

In a new paper, Granite Code Models: A Family of Open Foundation Models for Code Intelligence, an IBM research team introduces the Granite Code model family: open foundation models optimized for enterprise software development workflows that handle a broad range of coding tasks.

The Granite Code family comprises decoder-only models geared toward code-generation tasks (a brief usage sketch follows the list below) and comes in two primary variants across four sizes (3B, 8B, 20B, and 34B):

  • Granite Code Base: Serving as foundational models for code-related tasks.
  • Granite Code Instruct: Instruction-following models fine-tuned through a blend of Git commits paired with human instructions and datasets featuring open-source synthetically generated code instructions.
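
As a rough illustration of how the base variant can be used for code completion, here is a minimal sketch using the Hugging Face transformers library. The checkpoint name ibm-granite/granite-3b-code-base is an assumption about where and how the weights are published (the article only points to the project's GitHub), so adjust it to the actual release.

    # Minimal sketch: code completion with a Granite Code base model via
    # Hugging Face transformers. The model ID below is an assumed checkpoint
    # name, not one confirmed by the article.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "ibm-granite/granite-3b-code-base"  # assumed Hugging Face model ID

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # A decoder-only code model simply continues the prompt token by token.
    prompt = "def fibonacci(n):"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))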

The base models are trained with a two-phase strategy. In phase 1, the models are trained on 3 to 4 trillion tokens of code spanning 116 programming languages, giving them a broad grasp of programming-language syntax and structure. In phase 2, training continues on an additional 500 billion tokens drawn from carefully curated datasets spanning both code and natural language.

The instruct models are derived from these base models through further fine-tuning on a combination of a refined version of CommitPack, natural-language instruction-following datasets (such as OASST and HelpSteer), and open-source mathematical datasets (such as MathInstruct and MetaMathQA). Synthetically generated code instruction datasets also play a pivotal role in strengthening instruction-following and reasoning abilities.
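
For the instruct variants, prompts are formatted as instructions rather than raw code prefixes. The following is a minimal sketch under two assumptions not stated in the article: that an instruct checkpoint is published under an ID like ibm-granite/granite-8b-code-instruct, and that its tokenizer ships a chat template.

    # Minimal sketch for an instruct variant. The model ID and the presence of
    # a chat template on the tokenizer are assumptions, not details from the
    # article.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "ibm-granite/granite-8b-code-instruct"  # assumed checkpoint name

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    messages = [
        {"role": "user",
         "content": "Write a Python function that checks whether a string is a palindrome."},
    ]
    # Render the conversation into the prompt format the instruct model was
    # fine-tuned on (assuming the tokenizer defines a chat template).
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )

    outputs = model.generate(input_ids, max_new_tokens=128)
    # Strip the prompt tokens and print only the newly generated answer.
    print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))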

In their empirical study, the team evaluates the Granite Code models across a comprehensive array of benchmarks. The results show robust performance at every model size, with the Granite Code models often surpassing other open-source code models, including ones twice their size.

In summary, the key strengths of Granite Code models include:

  • All-rounder Code LLM: Competitive or state-of-the-art performance across a wide range of code-related tasks, including generation, explanation, debugging, editing, and translation, demonstrating versatility across diverse coding challenges.
  • Trustworthy Enterprise-Grade LLM: All models are trained on data collected following IBM’s AI Ethics principles and guided by IBM’s Corporate Legal team to ensure trustworthy enterprise usage. Furthermore, all Granite Code models are released under the Apache 2.0 license.

Looking ahead, the team is committed to continually enhancing these models’ performance. Future plans include the release of long-context variants, as well as specialized models tailored for Python and Java environments.

The code is available on the project’s GitHub. The paper Granite Code Models: A Family of Open Foundation Models for Code Intelligence is on arXiv.


Author: Hecate He | Editor: Chain Zhang


