The widespread development and deployment of new machine learning (ML) models, often across institutions and borders, has made efficient collaboration between diverse research teams crucial. However, the increasing complexity of ML models and the specialization of teams make sharing innovative ML ideas challenging: every party involved must be brought to a shared understanding of each idea, a process that does not scale well with the number of ideas or teams.
A Google Brain research team addresses this issue in the new paper PyGlove: Efficiently Exchanging ML Ideas as Code, which extends their PyGlove Library to leverage symbolic rule-based patches and simplify the scalable exchange of ML ideas as code.
Introduced by the authors in their NeurIPS 2020 paper PyGlove: Symbolic Programming for Automated Machine Learning, the PyGlove Library supports a novel symbolic programming paradigm that converts a static program into a search space, iterates on the search spaces and search algorithms, and crafts complex search flows to improve results.
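To make the idea of converting a static program into a search space more concrete, here is a minimal sketch in plain Python. It does not use PyGlove or reproduce its API; the `OneOf` placeholder and the `expand` helper are hypothetical names introduced only for illustration, and the "program" is assumed to be a flat dictionary of hyperparameters.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class OneOf:
    """Hypothetical placeholder that marks a value as searchable."""
    candidates: tuple


def expand(space):
    """Yield every concrete configuration from a flat dict that may hold OneOf values."""
    keys = list(space)
    # A fixed value is treated as a single-candidate choice.
    options = [v.candidates if isinstance(v, OneOf) else (v,) for v in space.values()]
    for combo in itertools.product(*options):
        yield dict(zip(keys, combo))


# A static "program": every hyperparameter has one fixed value.
static_program = {"layers": 4, "activation": "relu", "lr": 0.1}

# The same program turned into a search space by swapping fixed values for choices.
search_space = {**static_program, "layers": OneOf((2, 4)), "lr": OneOf((0.1, 0.01))}

configs = list(expand(search_space))
print(len(configs))  # 2 choices for layers x 2 choices for lr
```

A search algorithm would then pick among these configurations rather than enumerating them exhaustively; the enumeration here only shows that the space is derived from the original program without rewriting it.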
The team summarizes the main contributions in their new paper as follows:
- A method for efficiently and scalably sharing complex ML ideas as code using symbolic patches.
- An illustration of how symbolic programming can be used throughout the ML development process.
- The open-sourced PyGlove library and supplementary code used in this paper.
The researchers define an initial ML setup as a collection of components that produce a usable ML model and the improvement step as the editing of the initial setup through the application of conceptual rules representing ideas. Conventional approaches to such improvement procedures typically involve manual work that requires a deep understanding of the model components and their interactions.
The researchers aim to bypass this manual work by programmatically manipulating ML programs to automate the model improvement process. The approach is designed to facilitate the implementation of new ideas but can also be used to apply a given idea to other ML setups without modifying their source code.
PyGlove lets the rules applied in this manual process be expressed as patches and encapsulated into units that are reusable across different ML programs. Sharing is facilitated by giving each patch a globally unique name, so that it can be referenced through a human-readable, URI-like string.
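The mechanism can be sketched roughly as follows. This is a toy illustration in plain Python, not PyGlove's implementation or API: the registry, the `register_patch` decorator, the `patch_from_string` parser, and the `name?arg=value` string format are all assumptions made for the example, and the ML setup is assumed to be a nested dictionary.

```python
PATCHES = {}  # hypothetical global registry: patch name -> patch factory


def register_patch(name):
    """Register a patch factory under a globally unique name."""
    def decorator(factory):
        PATCHES[name] = factory
        return factory
    return decorator


def rewrite(node, rule):
    """Build a new nested dict, applying rule(key, value) to every entry."""
    if isinstance(node, dict):
        return {k: rewrite(rule(k, v), rule) for k, v in node.items()}
    return node


@register_patch("change_activation")
def change_activation(to):
    """Rule: replace every 'activation' entry, wherever it appears in the setup."""
    return lambda key, value: to if key == "activation" else value


def patch_from_string(uri):
    """Parse an assumed 'name?arg=value&arg=value' string into a rule."""
    name, _, query = uri.partition("?")
    kwargs = dict(pair.split("=", 1) for pair in query.split("&") if pair)
    return PATCHES[name](**kwargs)


setup = {
    "model": {
        "encoder": {"activation": "relu", "width": 64},
        "head": {"activation": "relu"},
    },
    "optimizer": {"lr": 0.1},
}

# The idea is applied by name, without editing the code that built `setup`.
patched = rewrite(setup, patch_from_string("change_activation?to=swish"))
print(patched["model"]["encoder"]["activation"])
```

Because the rule matches on structure rather than on a specific program, the same string could be applied to any other setup that contains `activation` entries.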
The symbolic patches can also be assembled to create more complex ML ideas and enable their exploration and potential adoption by product teams, freeing the teams to consider many ideas without the attendant burden of code migration.
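As a rough sketch of how patches might be assembled into a larger idea, the plain-Python example below chains two simple patches into one. It is an illustration under stated assumptions rather than PyGlove's actual composition mechanism: `set_field` and `compose` are hypothetical helpers, and a patch is assumed to be a function from one nested-dictionary setup to a new one.

```python
from functools import reduce


def set_field(field, new_value):
    """Hypothetical patch: set `field` to `new_value` wherever it occurs."""
    def patch(node):
        if isinstance(node, dict):
            return {k: new_value if k == field else patch(v) for k, v in node.items()}
        return node
    return patch


def compose(*patches):
    """Combine several patches into one, applied left to right."""
    return lambda setup: reduce(lambda current, p: p(current), patches, setup)


# Two small ideas assembled into one larger idea.
idea = compose(set_field("activation", "swish"), set_field("lr", 0.01))

setup = {"model": {"activation": "relu", "width": 64}, "optimizer": {"lr": 0.1}}
improved = idea(setup)
print(improved)
```

The composed patch is itself a patch, so it can be shared and combined further in the same way as its parts.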
The study shows that PyGlove can effectively simplify the engineering work required to move from one experiment to the next, enabling the easy expression, reuse, and sharing of new ideas among different teams. The researchers believe this could change the way ML programs are developed, organized, and shared, and they have open-sourced the extended PyGlove Library to encourage further testing and use by the ML research community.
The PyGlove Library is available on the project’s GitHub. The paper PyGlove: Efficiently Exchanging ML Ideas as Code is on arXiv.
Author: Hecate He | Editor: Michael Sarazen