Speeding up your code: creating NumPy universal functions with Numba
I recently read this very interesting series of articles about how one can speed up code written in Python.
I encourage you to go read the full series if you want more details, especially on how to use parallelisation and just-in-time compiling to speed up your code.
I want to focus here on the first trick presented, that of vectorizing your code with NumPy, instead of using Python loops and iterators. Don't get me wrong, I love Python iterators and their expressive power, but for numerical computations, they just won't do the job.
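To make the loop-versus-vectorized contrast concrete, here is a minimal sketch. The sum-of-squares task and the function names are purely illustrative and do not come from the series of articles.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Pure-Python loop: the interpreter handles one element at a time."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    """Vectorized version: the loop runs inside NumPy's compiled code."""
    return float(np.sum(values * values))


x = np.linspace(0.0, 1.0, 100_000)

# Both versions compute the same number; only the speed differs.
assert np.isclose(sum_of_squares_loop(x), sum_of_squares_numpy(x))
```

Timing the two functions (for instance with %timeit in a notebook) is the easiest way to see the gap on your own machine.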
Vectorizing your computation can give you a speedup of several orders of magnitude. However, this vectorizing step can be non-trivial and can require quite a lot of work: twisting NumPy around and handling harrowing concepts such as strides. This in turn makes your code harder to understand and to maintain. But what wouldn't we do for a factor-100 speedup?
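As an illustration of the kind of stride handling meant here, the sketch below builds overlapping windows over an array without copying data, then sums each window. The moving-sum task is an assumption chosen for the example, not something taken from the series.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(6, dtype=np.int64)  # [0 1 2 3 4 5]
step = x.strides[0]               # bytes between consecutive elements (8 for int64)

# Overlapping windows of length 3, built as a read-only view on x.
# Row i starts at element i, so both axes advance by one element.
window = 3
n_windows = x.size - window + 1
windows = as_strided(
    x,
    shape=(n_windows, window),
    strides=(step, step),
    writeable=False,
)

# One vectorized call replaces a nested Python loop.
moving_sum = windows.sum(axis=1)
print(moving_sum)
```

It works, but a wrong shape or stride silently reads the wrong memory, which is exactly why this kind of code is hard to maintain.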
What I want to test here is how large a speedup we can gain without vectorizing our Python code, but instead by using Numba to define a (compiled) universal function for NumPy.
Universal functions are vectorized functions operating elementwise on NumPy arrays. One can think of functions such as np.power. NumPy contains quite a lot of the most commonly needed functions, but Numba allows you to define your own through the @vectorize decorator. This should be far more efficient than using np.vectorize, which performs a Python for-loop underneath, whereas vectorized functions defined with Numba are compiled. Let's find out how fast we can get without vectorizing by hand.
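Here is a minimal sketch of the idea, assuming the decorator in question is Numba's vectorize and taking np.vectorize as the loop-based baseline. The clipped_square kernel is a made-up example, and the fallback branch is only there so the snippet still runs when Numba is not installed.

```python
import numpy as np


def clipped_square(x, limit):
    """Scalar kernel: written for one element, with no array logic at all."""
    y = x * x
    if y > limit:
        return limit
    return y


# Baseline: np.vectorize wraps the kernel in a Python-level loop.
slow_ufunc = np.vectorize(clipped_square, otypes=[float])

try:
    from numba import vectorize

    # Compiled universal function for float64 inputs and output.
    fast_ufunc = vectorize(["float64(float64, float64)"])(clipped_square)
except ImportError:
    # Assumption for this sketch only: fall back to the slow wrapper.
    fast_ufunc = slow_ufunc

values = np.linspace(-2.0, 2.0, 5)  # [-2, -1, 0, 1, 2]
result = fast_ufunc(values, 2.0)

# Both wrappers apply the same kernel elementwise, with broadcasting of the scalar.
assert np.allclose(result, slow_ufunc(values, 2.0))
```

The kernel keeps its plain if-statement, which is what makes this approach attractive: no masking or np.where gymnastics are needed to express the branch.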