Date: 2020-06-19

MLOps (Machine Learning Operations) is a set of practices that helps data scientists collaborate and connect their workflows across the model development and deployment pipeline by providing automated monitoring, testing, lineage, versioning and historical information.



The popular code-hosting platform GitHub, while a great place to host projects and share code, updates and notes, has traditionally offered its users few such MLOps features. In a bid to change that, GitHub recently introduced a series of free and open-source GitHub Actions that merge data science and machine learning workflows with a software development workflow. The project page states: “GitHub Actions connects all of your tools to automate every step of your development workflow.”



In a GitHub blog post, Staff Machine Learning Engineer Hamel Husain demonstrates how data scientists can create and organize a machine learning pipeline to run on infrastructure, collect metrics and report results.

Husain highlights a number of GitHub Actions for MLOps aimed at data scientists and machine learning researchers:



Orchestrating Machine Learning Pipelines:

Submit Argo Workflows – Allows users to orchestrate machine learning pipelines that run on Kubernetes.

Publish Kubeflow Pipelines to GKE – Publishes pipelines to Kubeflow Pipelines, a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers, running on Google Kubernetes Engine (GKE).

Jupyter Notebooks:

Run Parameterized Notebooks – Run notebooks programmatically using the Papermill tool.

Repo2Docker Action – Automatically turn data science repositories into Jupyter-enabled Docker containers using repo2docker.

Fastai/fastpages – Share information from Jupyter notebooks as blog posts using GitHub Actions & GitHub Pages.

End-To-End Workflow Orchestration:

Examples and templates for utilizing Azure Machine Learning from GitHub Actions.

Experiment Tracking:

Fetch runs from Weights & Biases – Retrieves runs from Weights & Biases, an experiment tracking and logging system for machine learning that is free for open-source projects.
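To give a sense of what "parameterized notebooks" means in the Jupyter Notebooks group above, here is a minimal Python sketch of the idea. It is not Papermill's code and does not execute anything. It only mimics the convention Papermill documents: a cell tagged `parameters` holds default values, and an `injected-parameters` cell with the overrides is placed directly after it. The function name `inject_parameters` and the sample notebook are hypothetical, made up for this illustration.

```python
import copy
import json


def inject_parameters(notebook: dict, parameters: dict) -> dict:
    """Return a copy of `notebook` with a cell that overrides default parameters.

    Illustrative only: follows the tagging convention described for Papermill,
    but is a hypothetical helper, not part of any library.
    """
    nb = copy.deepcopy(notebook)  # leave the caller's notebook untouched

    # Render each override as a Python assignment, e.g. `epochs = 5`.
    source = "\n".join(f"{name} = {value!r}" for name, value in parameters.items())
    injected = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": ["injected-parameters"]},
        "outputs": [],
        "source": source,
    }

    # Place the new cell right after the cell tagged "parameters";
    # fall back to the top of the notebook if no such cell exists.
    position = 0
    for index, cell in enumerate(nb["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            position = index + 1
            break
    nb["cells"].insert(position, injected)
    return nb


# Hypothetical two-cell notebook in the nbformat 4 JSON layout.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "execution_count": None,
         "metadata": {"tags": ["parameters"]}, "outputs": [],
         "source": "learning_rate = 0.01\nepochs = 1"},
        {"cell_type": "code", "execution_count": None,
         "metadata": {}, "outputs": [],
         "source": "print(learning_rate, epochs)"},
    ],
}

parameterized = inject_parameters(notebook, {"learning_rate": 0.001, "epochs": 5})
print(json.dumps(parameterized["cells"][1], indent=2))
```

With Papermill installed, the real equivalent is a call such as `papermill.execute_notebook(input_path, output_path, parameters={...})`, which also runs the notebook and saves the executed copy. That is the step a GitHub Actions workflow would automate.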

Husain says the set of GitHub Actions available for MLOps and data science will continue to expand, and he encourages the research community to refer to the GitHub MLOps page for the latest GitHub Actions, blog posts, talks and examples.

Journalist: Fangyu Cai | Editor: Michael Sarazen

