If you are a developer who has ever wished for a rich collection of machine learning algorithms implemented exclusively in NumPy, then you’ll love Numpy-ml — a new GitHub project that’s winning accolades from across the global machine learning community.
For those unfamiliar with NumPy (Numerical Python), it is one of the most popular libraries for the Python programming language. NumPy is the fundamental package for scientific computing with Python, serving as an efficient multi-dimensional container for generic data. Its vectorized array operations let users express a model’s entire computational flow in just a few lines of code.
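To illustrate that vectorized style, here is a minimal sketch (not taken from the project itself) of a tiny two-layer network's forward pass written as a handful of NumPy array operations:

```python
import numpy as np

# Minimal sketch of NumPy's vectorized style: a small two-layer
# network's forward pass as a few array operations.
rng = np.random.default_rng(0)

X = rng.normal(size=(4, 3))     # batch of 4 inputs with 3 features each
W1 = rng.normal(size=(3, 5))    # first-layer weights
W2 = rng.normal(size=(5, 2))    # second-layer weights

hidden = np.maximum(0, X @ W1)  # affine transform + ReLU activation
logits = hidden @ W2            # output scores

print(logits.shape)             # (4, 2): one score pair per input
```

No explicit loops are needed; matrix multiplication and element-wise maximum handle the whole batch at once.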
Spearheading the Numpy-ml project is David Bourgin, an ML research engineer who received his PhD in Cognitive Science at UC Berkeley. Although Bourgin says Numpy-ml was just “a fun thing for me to do in my spare time,” the collection has clearly met a need. In the four days following Bourgin’s project announcement on Reddit, his thread received 369 upvotes. The news was also retweeted by interested parties such as DeepMind research scientist Aida Nematzadeh. The Numpy-ml GitHub repository has received over 4,600 stars since it was established in April.
The project provides code for popular building blocks such as LSTM and bidirectional LSTM layers and Dropout regularization, as well as Hidden Markov model Viterbi decoding. With a total of 62 .py files, the project covers about 30 popular and less-popular machine learning models, averaging more than 500 lines of code per model. The layers.py file under the neural networks category alone contains close to 4,000 lines of code.
Bourgin explains “I’ve been slowly building a collection of pure-NumPy (and a little SciPy) implementations of various ML models + building blocks to use for quick reference. The project … might also be useful for others interested in bare-bones implementations of particular models / ideas.”
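As a flavor of what such a bare-bones building block looks like, here is a hedged sketch of inverted Dropout in pure NumPy; it is illustrative only and not numpy-ml’s actual implementation:

```python
import numpy as np

def dropout_forward(x, p_drop=0.5, training=True, rng=None):
    """Inverted dropout in pure NumPy (illustrative sketch, not
    numpy-ml's actual code): zero each unit with probability p_drop
    during training, and rescale survivors by 1/(1 - p_drop) so the
    expected activation matches test-time behavior."""
    if not training or p_drop == 0.0:
        return x  # at test time, dropout is the identity
    rng = rng or np.random.default_rng()
    # Boolean keep-mask, folded together with the rescaling factor
    mask = (rng.random(x.shape) >= p_drop) / (1.0 - p_drop)
    return x * mask
```

Each output element is either zero or the corresponding input scaled by `1 / (1 - p_drop)`, so activations keep the same expected magnitude whether or not dropout is active.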
The algorithms are divided into 11 categories, including:
- Gaussian mixture model
- Hidden Markov model
- Latent Dirichlet allocation (topic model)
- Neural networks
- Tree-based models
- Linear models
- n-Gram sequence models
- Reinforcement learning models
- Nonparametric models
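To give a concrete sense of one entry above, here is a short, hedged sketch of Hidden Markov model Viterbi decoding in pure NumPy (an illustration of the technique, not numpy-ml’s actual implementation):

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most-probable hidden-state path for a discrete HMM, computed in
    log space (illustrative sketch, not numpy-ml's actual code).
    log_pi: (S,) initial state log-probabilities
    log_A:  (S, S) transition log-probabilities (from-state x to-state)
    log_B:  (S, V) emission log-probabilities (state x symbol)
    obs:    sequence of observed symbol ids
    """
    S, T = log_pi.shape[0], len(obs)
    delta = np.empty((T, S))             # best path log-prob ending in each state
    back = np.empty((T, S), dtype=int)   # backpointers to the previous state
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A  # scores[i, j]: come from i, land in j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    # Trace the best path backwards from the most probable final state
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1][path[t + 1]]
    return path
```

For a two-state HMM where each state strongly prefers its own emission symbol, decoding the sequence `[0, 0, 1, 1]` recovers the matching state path, as one would expect.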
Bourgin is welcoming feedback on the project from the ML community. Details are available on his GitHub page.
Journalist: Fangyu Cai | Editor: Michael Sarazen