Almost 50 years ago, computer science pioneer Brian Kernighan wrote: “Debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?” While today’s computers have massively evolved, not much has changed with regard to detecting and repairing bugs in source code, which remains a challenging and time-consuming task that requires expert-level reasoning skills over formal structures and ambiguous information.
In the NeurIPS 2021-accepted paper Self-Supervised Bug Detection and Repair, a Microsoft Research team builds on the promise and potential of recently proposed deep learning-based bug detection methods to introduce BUGLAB, a self-supervised approach for bug detection and repair.
BUGLAB co-trains two models: 1) a bug detector model that learns to detect and repair bugs in code, and 2) a bug selector model that learns to create buggy code for the detector to use as training data.
The bug selector model first decides which bug-introducing rewrites to apply to an input code snippet, learning to pick the “hardest” rewrites, those the detector is most likely to miss. The bug detector then tries to locate and repair any added bugs. Both models thus predict rewrites on code snippets, but with opposite goals: one aims to introduce bugs and the other aims to repair them. Once training is complete, the researchers discard the selector model and rely on the trained detector model to locate and repair bugs.
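As a rough illustration of what a bug-introducing rewrite can look like, the sketch below enumerates every single-operator swap in a small Python snippet using the standard `ast` module. It is not the project's code: the swap table `WRONG_OPERATOR` and the function `wrong_operator_rewrites` are hypothetical names chosen for this example, and a learned selector would score such candidates rather than simply list them.

```python
import ast
import copy

# Hypothetical swap table (an assumption for this sketch): each comparison
# operator is mapped to a plausible "wrong" replacement.
WRONG_OPERATOR = {
    ast.Lt: ast.LtE, ast.LtE: ast.Lt,
    ast.Gt: ast.GtE, ast.GtE: ast.Gt,
    ast.Eq: ast.NotEq, ast.NotEq: ast.Eq,
}


def wrong_operator_rewrites(source: str) -> list[str]:
    """Return one buggy variant of `source` per swappable comparison operator."""
    tree = ast.parse(source)
    # Record (node index in walk order, operator position) for each rewrite site.
    sites = [
        (i, j)
        for i, node in enumerate(ast.walk(tree))
        if isinstance(node, ast.Compare)
        for j, op in enumerate(node.ops)
        if type(op) in WRONG_OPERATOR
    ]
    variants = []
    for i, j in sites:
        mutated = copy.deepcopy(tree)          # keep the original tree intact
        node = list(ast.walk(mutated))[i]      # same walk order as the original
        node.ops[j] = WRONG_OPERATOR[type(node.ops[j])]()
        variants.append(ast.unparse(mutated))  # requires Python 3.9+
    return variants


SNIPPET = """
def clamp(x, lo, hi):
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x
"""

if __name__ == "__main__":
    for variant in wrong_operator_rewrites(SNIPPET):
        print(variant, end="\n\n")
```

Each variant differs from the original in exactly one operator, which is the kind of small, localized bug a detector would be asked to find and undo.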
The Microsoft team has open-sourced PYBUGLAB, a BUGLAB implementation for Python that focuses on four common bug types: 1) Variable Misuse, 2) Argument Swapping, 3) Wrong Operator, and 4) Wrong Literal. They also consider rewrite rules for data augmentation to help with generalization, likening this to instances in computer vision where images are rotated or cropped but maintain their original content. PYBUGLAB implements the following rewrites for this purpose:
- Variable Renaming renames a local variable to a random name not already in scope.
- Comment Deletion removes code comments, including docstrings and inline comments. Such comments commonly contain natural language information that is useful for code comprehension, but usually do not affect program semantics.
- Comparison Expression Mirroring swaps the two sides of a comparison operator and changes it appropriately. For example, a<b is transformed to b>a. Note that in cases such as foo() < bar(), this will change the order of execution of foo and bar, possibly altering program semantics.
- If-Else Branch Swapping negates the test condition of an if-else statement or a ternary expression using De Morgan’s laws and swaps the then body with the else body.
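To make one of these augmentation rewrites concrete, here is a minimal sketch of Comparison Expression Mirroring built on Python's standard `ast` module. The class `MirrorComparisons`, the table `MIRRORED`, and the helper `mirror_comparisons` are hypothetical names for this example and do not come from the released implementation; the sketch also assumes that chained comparisons and operators such as `in` or `is` are simply left unchanged.

```python
import ast

# Mirror image of each operator when the two operands trade places.
MIRRORED = {
    ast.Lt: ast.Gt, ast.Gt: ast.Lt,
    ast.LtE: ast.GtE, ast.GtE: ast.LtE,
    ast.Eq: ast.Eq, ast.NotEq: ast.NotEq,
}


class MirrorComparisons(ast.NodeTransformer):
    """Swap the sides of simple comparisons and flip the operator to match."""

    def visit_Compare(self, node: ast.Compare) -> ast.AST:
        self.generic_visit(node)  # rewrite nested comparisons first
        # Only two-operand comparisons are handled; `a < b < c` is left alone.
        if len(node.ops) == 1 and type(node.ops[0]) in MIRRORED:
            flipped = MIRRORED[type(node.ops[0])]()
            return ast.Compare(
                left=node.comparators[0],
                ops=[flipped],
                comparators=[node.left],
            )
        return node


def mirror_comparisons(source: str) -> str:
    tree = MirrorComparisons().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


if __name__ == "__main__":
    print(mirror_comparisons("a < b"))  # prints: b > a
```

For side-effect-free operands the mirrored expression evaluates to the same value as the original. With calls such as `foo() < bar()`, the rewritten code evaluates `bar()` first, which is the caveat noted in the list above.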
To evaluate the proposed approach, the researchers built PyPIBugs, a dataset containing 2374 small real-world bugs. In experiments, BUGLAB improved on baseline methods by up to 30 percent on PyPIBugs and found 19 previously unknown bugs in open-source software.
While bug detection and repair remains a challenging task in today’s complex systems, the team hopes deep learning methods such as theirs can catch bugs earlier, helping developers work faster and deliver more robust software.
The code and PyPIBugs dataset are available on the project’s GitHub. The paper Self-Supervised Bug Detection and Repair is on arXiv.
Author: Hecate He | Editor: Michael Sarazen