Introducing ndindex, a Python library for manipulating indices of ndarrays
One of the most important features of NumPy arrays is their indexing
semantics. By "indexing" I mean anything that happens inside square brackets,
for example, `a[4::-1, 0, ..., [0, 1], np.newaxis]`. NumPy's index semantics
are very expressive and powerful, and this is one of the reasons the library
is so popular.
Index objects can be represented and manipulated directly. For example, the
above index is `(slice(4, None, -1), 0, Ellipsis, [0, 1], None)`. If you are
the author of a library that tries to replicate NumPy array semantics, you
will have to work with these objects. However, they are often difficult to
work with:
- The different types that are valid indices for NumPy arrays do not have a uniform API. Most of the types are also standard Python types, such as `tuple`, `list`, `int`, and `None`, which are usually unrelated to indexing.
- Those objects that are specific to indices, such as `slice` and `Ellipsis`, do not make any assumptions about their underlying semantics. For example, Python lets you create `slice(None, None, 0)` or `slice(0, 0.5)` even though `a[::0]` and `a[0:0.5]` would always raise an error on a NumPy array.
- Some index objects, such as `slice` (before Python 3.12), `list`, and `ndarray`, are not hashable.
- NumPy itself does not offer much in the way of helper functions to work with these objects.
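
To make the points above concrete, here is a small sketch in plain Python. `IndexCapture`, `error_name`, and `is_hashable` are hypothetical helpers written for this illustration, not part of NumPy or ndindex, and a plain list stands in for an array so the snippet runs without NumPy (the exception NumPy raises may differ in type or message).

```python
class IndexCapture:
    """Hypothetical helper: hands back whatever appears inside the brackets."""

    def __getitem__(self, index):
        return index


capture = IndexCapture()

# np.newaxis is an alias for None, so plain None is used here.
idx = capture[4::-1, 0, ..., [0, 1], None]
print(idx)


def error_name(sequence, index):
    """Return the name of the exception raised by sequence[index], or None."""
    try:
        sequence[index]
    except Exception as exc:
        return type(exc).__name__
    return None


# Python constructs both slices without complaint...
zero_step = slice(None, None, 0)
float_stop = slice(0, 0.5)

# ...and the problem only shows up once they are used as an index.
# (A list is used as a stand-in for an array here.)
zero_step_error = error_name([10, 20, 30], zero_step)
float_stop_error = error_name([10, 20, 30], float_stop)


def is_hashable(obj):
    """Hypothetical helper: True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# A list index cannot be used as a dict key or set member.
list_is_hashable = is_hashable([0, 1])
```

The captured `idx` is an ordinary tuple mixing a `slice`, an `int`, `Ellipsis`, a `list`, and `None`, which is exactly the lack of a uniform API described in the first point.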
These limitations may be annoying, but are easy enough to live with. The real
challenge when working with indices comes when you try to manipulate them.
Slices in particular are challenging to work with because of the rich meaning of
slice semantics. Writing formulas for even very simple things is a real
challenge with slices. `slice(start, stop, step)` (corresponding to
`a[start:stop:step]`) has a fundamentally different meaning depending on whether
`start`, `stop`, or `step` are negative, nonnegative, or `None`. As an example,
take `a[4:-2:-2]`, where `a` is a one-dimensional array. This takes every
other element, going backwards from index 4 (the fifth element) and stopping
before the second-to-last element. What will the shape of this sliced array
be? The answer is `(0,)` if the length of `a` is less than 1 or greater than 5,
and `(1,)` otherwise.
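
The claim about the resulting shape is easy to check by brute force. The sketch below relies on the fact that Python's built-in `slice.indices` and `range` follow the same rules as a one-dimensional NumPy slice; `sliced_length` is a hypothetical helper name chosen for this example.

```python
def sliced_length(s, n):
    """Length of seq[s] for a one-dimensional sequence of length n."""
    # slice.indices(n) resolves None, negative values, and clipping
    # into a concrete (start, stop, step) triple for length n.
    return len(range(*s.indices(n)))


s = slice(4, -2, -2)

# Map each array length n to the length of a[4:-2:-2].
lengths = {n: sliced_length(s, n) for n in range(9)}

for n, length in lengths.items():
    print(f"len(a) = {n}: a[4:-2:-2] has shape ({length},)")
```

Only lengths 1 through 5 give a single element; every other length gives an empty result.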
Code that manipulates slices will tend to have a lot of `if`/`else` chains for
these different cases. And due to 0-based indexing, half-open semantics,
wraparound behavior, clipping, and step logic, the formulas are often quite
difficult to write down.
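
As a rough illustration of what such code looks like, here is a hand-written sketch that computes the length of `a[start:stop:step]` for a one-dimensional `a` of length `n`. The function name `slice_length` is hypothetical, and this is not how ndindex implements it; it only shows how quickly the cases pile up even for a question this simple.

```python
def slice_length(start, stop, step, n):
    """Number of elements selected by a[start:stop:step] when len(a) == n."""
    if step is None:
        step = 1
    if step == 0:
        raise ValueError("slice step cannot be zero")

    if step > 0:
        # Forward slice: bounds are clipped to the range [0, n].
        if start is None:
            start = 0
        elif start < 0:
            start = max(start + n, 0)  # wrap around, then clip
        else:
            start = min(start, n)
        if stop is None:
            stop = n
        elif stop < 0:
            stop = max(stop + n, 0)
        else:
            stop = min(stop, n)
        if stop <= start:
            return 0
        # Ceiling division of the half-open span by the step.
        return (stop - start + step - 1) // step

    # Backward slice: bounds are clipped to the range [-1, n - 1].
    if start is None:
        start = n - 1
    elif start < 0:
        start = max(start + n, -1)
    else:
        start = min(start, n - 1)
    if stop is None:
        stop = -1
    elif stop < 0:
        stop = max(stop + n, -1)
    else:
        stop = min(stop, n - 1)
    if start <= stop:
        return 0
    return (start - stop - step - 1) // (-step)


# The example from the text: a[4:-2:-2] on an array of length 3.
example = slice_length(4, -2, -2, 3)
```

Every branch corresponds to one of the features listed above: `None` defaults, wraparound of negative values, clipping to the array bounds, the half-open stop, and a different clipping range when the step is negative.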