The incredible generative capabilities of Large Language Models (LLMs) have ushered in a new era of automation in coding tasks. Applications like Amazon CodeWhisperer, GitHub Copilot, and Replit have become ubiquitous, harnessing LLMs to complete code segments or carry out code modifications based on natural language instructions.
Despite their remarkable performance, these tools encounter challenges when tackling repository-level coding tasks. These tasks involve extensive code alterations across entire codebases, such as package migration or the addition of type annotations and other specifications.
In a recent paper, “CodePlan: Repository-level Coding using LLMs and Planning,” a team from Microsoft Research introduces CodePlan—a versatile framework designed to address the complexities of repository-level coding tasks, encompassing extensive code changes across large, interconnected codebases.
The team highlights the following key contributions:
- Problem Formalization: The team pioneers the formalization of the problem of automating repository-level coding tasks with LLMs, necessitating the analysis of code changes’ effects and their propagation throughout the repository.
- Planning Paradigm: They conceptualize repository-level coding as a planning problem and devise CodePlan, a task-agnostic framework. This framework incorporates an innovative blend of incremental dependency analysis, change impact assessment, and an adaptive planning algorithm.
- Empirical Validation: The team conducts experiments involving two repository-level coding tasks, employing the gpt-4-32k model. The tasks involve package migration for C# repositories and temporal code edits for Python repositories.
- Performance Superiority: Their results showcase CodePlan’s superior alignment with the ground truth when compared to baseline methods.
The primary objective of this research is to develop a repository-level coding system capable of autonomously generating derived specifications for edits, thus achieving a valid repository state. In this context, “validity” refers to compliance with predefined correctness conditions, which can be instantiated in various ways, such as error-free building, successful static analysis, adherence to a type system, passing a battery of tests, or meeting verification tool criteria.
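In this framing, the correctness oracle is pluggable. The sketch below, a toy illustration rather than the paper's implementation, shows one way such an oracle could look in Python: here "validity" is simply the absence of legacy API references, standing in for a real build, type check, or test suite.

```python
from dataclasses import dataclass

@dataclass
class OracleReport:
    valid: bool
    errors: list  # e.g. compiler diagnostics or failing-test identifiers

def build_oracle(repo: dict) -> OracleReport:
    """Toy correctness oracle: the repository is 'valid' when no file
    still references the legacy API (stands in for an error-free build)."""
    errors = [path for path, src in repo.items() if "OldApi" in src]
    return OracleReport(valid=not errors, errors=errors)

# Usage: a two-file toy repository where one file is still on the legacy API.
repo = {"a.py": "OldApi.call()", "b.py": "NewApi.call()"}
report = build_oracle(repo)
print(report.valid, report.errors)  # False ['a.py']
```

Any of the correctness conditions listed above (static analysis, a type system, a test battery) could be swapped in behind the same interface.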
CodePlan, the proposed solution, is designed to synthesize a multi-step plan for resolving repository-level coding tasks. As depicted in Figure 2, CodePlan takes as input a repository, a task accompanied by initial specifications expressed through natural language instructions or initial code edits, a correctness oracle, and an LLM.
Specifically, CodePlan constructs a plan graph in which each node identifies a code edit obligation that the LLM needs to discharge, and each edge indicates that the target node must be discharged after the source node. CodePlan monitors the code edits and adaptively extends the plan graph. Once all the steps in a plan are completed, the repository is analyzed by the oracle. The task is complete if the oracle validates the repository; if it finds errors, the error reports are used as seed specifications for the next round of plan generation and execution.
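The outer loop just described can be sketched in a few lines of Python. This is a minimal illustration, not the paper's actual algorithm or API: `oracle`, `dependents` (a stand-in for change-impact analysis), and `llm_edit` are all hypothetical names, and the "task" is a toy `OldApi`-to-`NewApi` migration.

```python
from collections import deque

def oracle(repo):
    """Toy correctness oracle: files still referencing the legacy API."""
    return [p for p, src in repo.items() if "OldApi" in src]

def dependents(repo, path):
    """Toy change-impact analysis: files that reference the edited file."""
    stem = path.split(".")[0]
    return [p for p, src in repo.items() if stem in src and p != path]

def llm_edit(repo, path):
    """Stand-in for the LLM discharging one edit obligation."""
    repo[path] = repo[path].replace("OldApi", "NewApi")

def codeplan(repo, seed_edits, max_rounds=3):
    frontier = deque(seed_edits)               # seed specifications
    for _ in range(max_rounds):
        discharged = set()
        while frontier:                        # adaptively walk the plan graph
            node = frontier.popleft()
            if node in discharged:
                continue
            impacted = dependents(repo, node)  # edges: discharge after `node`
            llm_edit(repo, node)
            discharged.add(node)
            frontier.extend(impacted)          # extend the plan graph
        errors = oracle(repo)                  # validate the whole repository
        if not errors:
            return True
        frontier = deque(errors)               # errors seed the next round
    return False

repo = {"lib.py": "OldApi = object", "app.py": "from lib import OldApi"}
print(codeplan(repo, ["lib.py"]))  # True: the edit propagates lib.py -> app.py
```

The point of the sketch is the control flow: an edit to `lib.py` creates a new obligation on `app.py` via the dependency edge, and the oracle's error reports restart planning if anything is missed.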
In their empirical study, the research team compares CodePlan against a baseline approach that employs a build system to iteratively identify breaking changes and leverages an LLM to rectify them. Notably, CodePlan successfully enables 5 out of 6 repositories to pass the validity checks, while the baseline fails to achieve this outcome. This underscores the advantage of CodePlan over the oracle-guided repair technique employed by the baseline.
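For contrast, the baseline's loop can be sketched as follows, again with illustrative names and a toy "build" check rather than the study's actual setup: it simply rebuilds, collects breaking changes, and has the LLM patch them, with no dependency analysis or planned ordering between fixes.

```python
def baseline_repair(repo, max_iters=5):
    """Oracle-guided repair: build, patch whatever breaks, repeat."""
    for _ in range(max_iters):
        broken = [p for p, src in repo.items() if "OldApi" in src]  # toy build
        if not broken:
            return True               # build is clean: repository is valid
        for path in broken:           # local, order-agnostic LLM repairs
            repo[path] = repo[path].replace("OldApi", "NewApi")
    return False

repo = {"a.py": "OldApi.call()", "b.py": "x = OldApi"}
print(baseline_repair(repo))
```

This toy task is simple enough for the baseline to finish; on real repositories, the correct fix for one file often depends on signatures that have already changed elsewhere, which is where the unordered repair loop falls behind CodePlan's dependency-aware plan.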
In conclusion, CodePlan presents a promising avenue for automating intricate repository-level coding tasks, offering gains in both productivity and accuracy.
The paper CodePlan: Repository-level Coding using LLMs and Planning is available on arXiv.
Author: Hecate He | Editor: Chain Zhang