Since NumPy was introduced to the world 15 years ago, the primary array programming library has grown into the fundamental package for scientific computing with Python. NumPy serves as an efficient multi-dimensional container of generic data and plays a leading role in scientific computing. It is an essential component in research analysis pipelines across fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. NumPy is open-sourced and has myriad contributors.
But one thing has always been missing. A thorough review paper that is fully representative of the team behind NumPy’s genesis has never been published.
The missing chapter in the NumPy story was written yesterday, with the appearance of the paper Array Programming with NumPy in the leading scientific journal Nature.
“NumPy is the foundation upon which the scientific Python ecosystem is constructed,” trumpets the paper. NumPy is so pervasive that niche projects have even developed their own NumPy-like interfaces and array objects. The widespread use of NumPy has already provided ample documentation for researchers and developers, and writing a review paper for an exacting journal such as Nature was undoubtedly challenging and time-consuming. So why write one now?
One of the authors, Stéfan van der Walt, a senior research data scientist at the Berkeley Institute for Data Science, tweeted: “Our last paper was ~2010 & not fully representative of the team. While we love that people use our software, many of our team members are in academia where citations count. We hope this will give them the credit needed to receive grant funding & produce more high quality software.”
Although NumPy is not part of Python’s standard library, it underpins almost every Python library that does scientific or numerical computation, including SciPy, Matplotlib, scikit-learn and scikit-image, and benefits from a good relationship with Python developers. The ecosystem of libraries that build on NumPy is informed by a culture of best practices for making reliable scientific software, a culture carefully curated by the developers. Accordingly, many research groups have designed their own large and complex libraries, both to add application-specific functionality to the ecosystem and to advance their studies. The eht-imaging library developed by the Event Horizon Telescope (EHT) collaboration, for example, was used to produce the first image of a black hole.
The rapid evolution of data science, machine learning, and artificial intelligence is paralleled by Python’s flourishing throughout the scientific community over the past fifteen years, which seems an eternity in scientific computation. Simply put, NumPy has stood the test of time. Its simple memory model has made it easy to write low-level, hand-optimized code that manipulates NumPy arrays and passes them back to Python. Array protocols are now a key feature of NumPy, which has been continually refined by dedicated developers to improve utility and simplify adoption. “Because of its inherent simplicity, the NumPy array is the de facto exchange format for array data in Python,” the team proudly notes.
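For readers curious what the "simple memory model" and "exchange format" look like in practice, here is a minimal sketch, not taken from the paper. It assumes a standard NumPy installation, and the `Temperatures` class is a hypothetical stand-in for any third-party container that opts into NumPy's `__array__` protocol.

```python
import numpy as np

# A NumPy array is a single block of memory plus metadata (shape, strides).
a = np.arange(12, dtype=np.int64).reshape(3, 4)
print(a.shape, a.strides)  # strides are byte steps per dimension

# Transposing only swaps the metadata; both arrays share the same memory.
t = a.T
print(t.strides, np.shares_memory(a, t))

# The same buffer can be handed to other code without copying,
# here through Python's built-in buffer protocol.
exported = memoryview(a)
reimported = np.asarray(exported)
print(np.shares_memory(a, reimported))


class Temperatures:
    """Hypothetical container that exposes its data via the __array__ protocol."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # NumPy calls this hook when it needs an ndarray view of the object.
        return np.array(self._values, dtype=dtype)


# NumPy functions accept the custom object because it implements __array__.
mean = np.mean(np.asarray(Temperatures([20.0, 22.0, 24.0])))
print(mean)
```

Because the array is just memory plus a description of how to walk it, low-level code and other libraries can read or write the same data in place, which is roughly what makes it practical as a common exchange format.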
The research team notes in the paper that NumPy still depends heavily on contributions from graduate students and researchers in their free time, and that NumPy developers everywhere welcome this help from the community.
The paper Array Programming with NumPy is available in Nature.
Reporter: Fangyu Cai | Editor: Michael Sarazen