Motivation
Other array objects
NumPy provides a simple, rectangular, dense, in-memory data store. This is great for some applications, but it does not cover every use case. The following array objects are available today; each offers different features and caters to a different audience.
- Dask is one of the most popular ones. It allows distributed and chunked computation.
- CuPy is another popular one, and allows GPU computation.
- PyData/Sparse is slowly gaining popularity, and is a sparse, in-memory data store.
- XArray includes named dimensions.
- Xnd is another effort to re-write and modernise the NumPy API, and includes support for GPU arrays and ragged arrays.
- Another effort (although with no Python wrapper, only data marshalling) is xtensor.
Some of these objects can be composed. Namely, Dask both expects and exports the NumPy API, whereas XArray expects the NumPy API. This makes interesting combinations possible, such as distributed sparse or GPU arrays, or even labelled distributed sparse or CPU/GPU arrays.
Also, there are many other libraries (a popular one being scikit-learn) that need a back-end mechanism in order to be able to support different kinds of array objects. Finally, there is a desire to see SciPy itself gain support for other array objects.
__array_function__ and its limitations
One of my motivations for working on uarray was the set of limitations in the __array_function__ protocol, defined in the NumPy proposal NEP 18. The limitations are as follows:
- It can only dispatch on array objects.
- Consequently, it can only dispatch on functions that accept array objects.
- It has no mechanism for conversion and coercion.
- Since it conflates arrays and backends, only a single backend type per array object is possible.
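The first two limitations can be seen in a small sketch. It assumes a NumPy version with the __array_function__ protocol enabled (1.17 or later); the LoggingArray class and its calls attribute are hypothetical names invented for this illustration and are not part of NumPy or uarray.

```python
import numpy as np


class LoggingArray:
    """Hypothetical duck array that records which NumPy functions it intercepts."""

    def __init__(self, data):
        self.data = np.asarray(data)
        self.calls = []

    def __array_function__(self, func, types, args, kwargs):
        # NumPy reaches this hook only because a LoggingArray appears among
        # the array arguments of the function being called.
        self.calls.append(func.__name__)
        # Unwrap our own type and fall back to NumPy's implementation.
        unwrapped = [a.data if isinstance(a, LoggingArray) else a for a in args]
        return func(*unwrapped, **kwargs)


arr = LoggingArray([1, 2, 3])

# Dispatches: the array argument carries the override.
total = np.sum(arr)

# Cannot dispatch: np.ones takes a shape rather than an array, so there is
# no object whose __array_function__ could be consulted.
ones = np.ones(3)
```

Here np.sum is routed through LoggingArray, while np.ones always produces a plain NumPy array, because no argument is available to select a different implementation.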
These limitations have been partially discussed before.
uarray: The solution?
With that out of the way, let's explore uarray, a library that hopes to resolve these issues. Although the original motivation was NumPy and array computing, the library itself is meant to be a generic multiple-dispatch mechanism.