NumPy array indexing

One-dimensional arrays

import numpy as np

a = np.array([1, 5, 3, 19, 13, 7, 3])
a[3]

19

a[2:5]

array([ 3, 19, 13])

a[2:-1]

array([ 3, 19, 13, 7])

a[:2]

array([1, 5])

a[2::2]

array([ 3, 13, 3])

a[::-1]

array([ 3, 7, 13, 19, 3, 5, 1])

Of course, you can modify elements:

a[3] = 999
a

array([ 1, 5, 3, 999, 13, 7, 3])

You can also modify an ndarray slice:

a[2:5] = [997, 998, 999]
a

array([ 1, 5, 997, 998, 999, 7, 3])

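Differences with regular Python lists

Unlike a regular Python list, an ndarray accepts a single value on the right-hand side of a slice assignment and copies it across the whole slice (broadcasting):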

a[2:5] = -1
a

array([ 1, 5, -1, -1, -1, 7, 3])
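To make the contrast with a plain Python list concrete, here is a small sketch. The names values, py_list, arr and list_error are only illustrative and do not appear elsewhere in this tutorial.

```python
import numpy as np

values = [1, 5, 3, 19, 13, 7, 3]

# A Python list refuses a scalar on the right-hand side of a slice assignment.
py_list = list(values)
try:
    py_list[2:5] = -1
    list_error = None
except TypeError as e:
    list_error = e
print(type(list_error).__name__)  # TypeError

# Assigning a shorter sequence to a list slice silently shrinks the list.
py_list[2:5] = [-1]
print(py_list)                    # [1, 5, -1, 7, 3]

# An ndarray broadcasts the scalar over the slice and keeps its length.
arr = np.array(values)
arr[2:5] = -1
print(arr.tolist())               # [1, 5, -1, -1, -1, 7, 3]
```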

Also, you cannot grow or shrink ndarrays this way:

try:
    a[2:5] = [1,2,3,4,5,6]  # too long
except ValueError as e:
    print(e)

cannot copy sequence with size 6 to array axis with dimension 3

You cannot delete elements either:

try:
    del a[2:5]
except ValueError as e:
    print(e)

cannot delete array elements
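If you do need a shorter or longer array, one option is to build a new array instead of resizing in place. The sketch below uses np.delete and np.concatenate; the names shorter and longer are just illustrative.

```python
import numpy as np

a = np.array([1, 5, 3, 19, 13, 7, 3])

# np.delete returns a NEW array without elements 2, 3 and 4; `a` is unchanged.
shorter = np.delete(a, slice(2, 5))

# np.concatenate builds a NEW, longer array from pieces of `a`.
longer = np.concatenate((a[:2], [1, 2, 3, 4, 5, 6], a[5:]))

print(shorter.tolist())  # [1, 5, 7, 3]
print(longer.tolist())   # [1, 5, 1, 2, 3, 4, 5, 6, 7, 3]
print(a.size)            # 7
```

Both calls allocate a fresh array, so they are best kept out of tight loops.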

Last but not least, ndarray slices are actually views on the same data buffer. This means that if you create a slice and modify it, you are actually going to modify the original ndarray as well!

a_slice = a[2:6]
a_slice[1] = 1000
a  # the original array was modified!

array([ 1, 5, -1, 1000, -1, 7, 3])

If you want a copy of the data, you need to use the copy method:

another_slice = a[2:6].copy()
another_slice[1] = 3000
a  # the original array is untouched

array([ 1, 5, -1, 1000, -1, 7, 3])

a[3] = 4000
another_slice  # similarly, modifying the original array does not affect the slice copy

array([ -1, 3000, -1, 7])
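When it is not obvious whether an array is a view or a copy, you can check. The sketch below uses np.shares_memory and the base attribute; the names view and copy are illustrative, and a fresh array is used so the result does not depend on the earlier cells.

```python
import numpy as np

a = np.array([1, 5, 3, 19, 13, 7, 3])
view = a[2:6]          # basic slice: a view on a's buffer
copy = a[2:6].copy()   # explicit copy: its own buffer

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, copy))  # False

# A view remembers the array it was taken from; a copy owns its data.
print(view.base is a)             # True
print(copy.base is None)          # True
```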

Multi-dimensional arrays

Multi-dimensional arrays can be accessed in a similar way by providing an index or slice for each axis, separated by commas:

b = np.arange(48).reshape(4, 12)
b

array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]])

b[1, 2]  # row 1, col 2

14

b[1, :]  # row 1, all columns

array([12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23])

b[:, 1]  # all rows, column 1

array([ 1, 13, 25, 37])

Note the subtle difference between the following two expressions:

b[1, :]

array([12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23])

b[1:2, :]

array([[12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]])

The first expression returns row 1 as a 1D array of shape (12,), while the second returns that same row as a 2D array of shape (1, 12).
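The difference is easiest to see by comparing shapes directly. In this sketch the names row_1d and row_2d are illustrative; np.newaxis is shown as one way to go from the 1D form to the 2D form.

```python
import numpy as np

b = np.arange(48).reshape(4, 12)

row_1d = b[1, :]     # an integer index drops the axis
row_2d = b[1:2, :]   # a slice keeps the axis, with length 1

print(row_1d.shape)  # (12,)
print(row_2d.shape)  # (1, 12)

# Adding a new leading axis to the 1D row gives the same 2D result.
print(np.array_equal(row_1d[np.newaxis, :], row_2d))  # True
```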

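Fancy indexing

You may also specify a list of the indices you are interested in. This is referred to as fancy indexing.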

rows 0 and 2, columns 2 to 4 (5-1)

b[(0,2), 2:5]

array([[ 2, 3, 4],
[26, 27, 28]])

all rows, columns -1 (last), 2 and -1 (again, and in this order)

b[:, (-1, 2, -1)]

array([[11, 2, 11],
[23, 14, 23],
[35, 26, 35],
[47, 38, 47]])

If you provide multiple index arrays, you get a 1D ndarray containing the values of the elements at the specified coordinates.

returns a 1D array with b[-1, 5], b[2, 9], b[-1, 1] and b[2, 9] (again)

b[(-1, 2, -1, 2), (5, 9, 1, 9)]

array([41, 33, 37, 33])
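One way to read this is that the index arrays are paired up element by element, like zip. The sketch below checks that reading, and also shows that fancy indexing returns a copy rather than a view. The names rows, cols, picked and by_hand are illustrative.

```python
import numpy as np

b = np.arange(48).reshape(4, 12)
rows = (-1, 2, -1, 2)
cols = (5, 9, 1, 9)

picked = b[rows, cols]

# Same result, built one coordinate pair at a time.
by_hand = [int(b[r, c]) for r, c in zip(rows, cols)]

print(picked.tolist())  # [41, 33, 37, 33]
print(by_hand)          # [41, 33, 37, 33]

# Unlike a basic slice, the fancy-indexed result is a copy:
# modifying it leaves b untouched.
picked[0] = -1
print(int(b[-1, 5]))    # 41
```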

Higher dimensions

c = b.reshape(4, 2, 6)
c

array([[[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11]],
[[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]],
[[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]],
[[36, 37, 38, 39, 40, 41],
[42, 43, 44, 45, 46, 47]]])

c[2, 1, 4]  # matrix 2, row 1, col 4

34

c[2, :, 3]  # matrix 2, all rows, col 3

array([27, 33])

If you omit coordinates for some axes, then all elements in these axes are returned:

Return matrix 2, row 1, all columns. This is equivalent to c[2, 1, :]

c[2, 1]

array([30, 31, 32, 33, 34, 35])

Ellipsis (...)

matrix 2, all rows, all columns. This is equivalent to c[2, :, :]

c[2, ...]

array([[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])

matrix 2, row 1, all columns. This is equivalent to c[2, 1, :]

c[2, 1, ...]

array([30, 31, 32, 33, 34, 35])

matrix 2, all rows, column 3. This is equivalent to c[2, :, 3]

c[2, ..., 3]

array([27, 33])

all matrices, all rows, column 3. This is equivalent to c[:, :, 3]

c[..., 3]

array([[ 3, 9],
[15, 21],
[27, 33],
[39, 45]])
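The equivalences listed above can be checked in one go. This is only a sketch: pairs, all_equal and ellipsis_error are illustrative names, and c is rebuilt here so the block runs on its own.

```python
import numpy as np

c = np.arange(48).reshape(4, 2, 6)

# Each pair selects the same elements; the ellipsis stands in for the missing ":".
pairs = [
    (c[2, ...], c[2, :, :]),
    (c[2, 1, ...], c[2, 1, :]),
    (c[2, ..., 3], c[2, :, 3]),
    (c[..., 3], c[:, :, 3]),
]
all_equal = all(np.array_equal(left, right) for left, right in pairs)
print(all_equal)  # True

# An index may contain only one ellipsis; a second one is rejected.
try:
    c[..., 1, ...]
    ellipsis_error = None
except IndexError as e:
    ellipsis_error = e
print(type(ellipsis_error).__name__)  # IndexError
```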

Boolean indexing

b = np.arange(48).reshape(4, 12)
b

array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]])

rows_on = np.array([True, False, True, False])
b[rows_on, :] # Rows 0 and 2, all columns. Equivalent to b[(0, 2), :]

array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]])

cols_on = np.array([False, True, False] * 4)
b[:, cols_on]  # All rows, columns 1, 4, 7 and 10

array([[ 1, 4, 7, 10],
[13, 16, 19, 22],
[25, 28, 31, 34],
[37, 40, 43, 46]])

np.ix_

You cannot use boolean arrays on several axes at once to select the combination of the chosen rows and columns, but you can work around this by using the ix_ function:

b[np.ix_(rows_on, cols_on)]

array([[ 1, 4, 7, 10],
[25, 28, 31, 34]])

np.ix_(rows_on, cols_on)

(array([[0],
[2]]), array([[ 1, 4, 7, 10]]))
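To see what ix_ buys you, the sketch below compares it with indexing in two steps, and shows what happens if the two boolean arrays are passed directly. The names via_ix, two_step and direct_error are illustrative.

```python
import numpy as np

b = np.arange(48).reshape(4, 12)
rows_on = np.array([True, False, True, False])
cols_on = np.array([False, True, False] * 4)

# ix_ turns the two masks into index arrays that broadcast to a 2x4 grid.
via_ix = b[np.ix_(rows_on, cols_on)]

# Same selection in two steps: rows first, then columns.
two_step = b[rows_on][:, cols_on]

print(via_ix.shape)                       # (2, 4)
print(np.array_equal(via_ix, two_step))   # True

# Passing both masks directly tries to pair 2 row indices with 4 column
# indices, which cannot be broadcast together.
try:
    b[rows_on, cols_on]
    direct_error = None
except IndexError as e:
    direct_error = e
print(type(direct_error).__name__)        # IndexError
```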

If you use a boolean array that has the same shape as the ndarray, then you get in return a 1D array containing all the values that have True at their coordinate. This is generally used along with conditional operators:

b[b % 3 == 1]

array([ 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46])
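A boolean mask can also appear on the left-hand side of an assignment. The sketch below works on a copy so that b itself stays intact; mask, selected and b_masked are illustrative names.

```python
import numpy as np

b = np.arange(48).reshape(4, 12)

mask = b % 3 == 1      # boolean array with the same shape as b
selected = b[mask]     # 1D array of the matching values

# Assign through the mask: every matching element is set to 0.
b_masked = b.copy()
b_masked[mask] = 0

print(mask.shape)              # (4, 12)
print(selected.size)           # 16
print(b_masked[0].tolist())    # [0, 0, 2, 3, 0, 5, 6, 0, 8, 9, 0, 11]
```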

Iterating

Iterating over an ndarray goes along its first axis. For example, take a 3D array composed of two 3x4 matrices:

c = np.arange(24).reshape(2, 3, 4)
c

array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])

for m in c:
    print("Item:")
    print(m)

Item:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
Item:
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]

You can also iterate by index, since len(c) == c.shape[0]:

for i in range(len(c)):  # Note that len(c) == c.shape[0]
    print("Item:")
    print(c[i])

Item:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
Item:
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]

If you want to iterate on all elements in the ndarray, simply iterate over the flat attribute:

for i in c.flat:
    print("Item:", i)
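The flat attribute visits elements in row-major order, the same order as ravel(). If you also need each element's coordinates, np.ndenumerate is one option. In this sketch flat_items and found are illustrative names.

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)

# c.flat yields every element, last axis varying fastest.
flat_items = [int(x) for x in c.flat]
print(flat_items == c.ravel().tolist())  # True

# np.ndenumerate yields (index tuple, value) pairs.
found = None
for index, value in np.ndenumerate(c):
    if value == 17:
        found = index
print(found)  # (1, 1, 1)
```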

Stacking arrays

q1 = np.full((3,4), 1.0)
q1

array([[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.]])

q2 = np.full((4,4), 2.0)
q2

array([[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.]])

q3 = np.full((3,4), 3.0)
q3

array([[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.]])

vstack

Now let's stack them vertically using vstack:

q4 = np.vstack((q1, q2, q3))
q4

array([[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.]])

q4.shape

(10, 4)

hstack

We can also stack arrays horizontally using hstack:

q5 = np.hstack((q1, q3))
q5

array([[ 1., 1., 1., 1., 3., 3., 3., 3.],
[ 1., 1., 1., 1., 3., 3., 3., 3.],
[ 1., 1., 1., 1., 3., 3., 3., 3.]])

q5.shape

(3, 8)

This is possible because q1 and q3 both have 3 rows. But since q2 has 4 rows, it cannot be stacked horizontally with q1 and q3:

try:
    q5 = np.hstack((q1, q2, q3))
except ValueError as e:
    print(e)

all the input array dimensions except for the concatenation axis must match exactly

concatenate

The concatenate function stacks arrays along any given existing axis.

q7 = np.concatenate((q1, q2, q3), axis=0)  # Equivalent to vstack
q7

array([[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.]])

q7.shape

(10, 4)

As you might guess, hstack is equivalent to calling concatenate with axis=1.
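A quick check of that equivalence for 2D arrays is sketched below. One caveat worth knowing: for 1D inputs there is no axis 1, and hstack concatenates along axis 0 instead. The names same_2d and same_1d are illustrative.

```python
import numpy as np

q1 = np.full((3, 4), 1.0)
q3 = np.full((3, 4), 3.0)

# For 2D arrays, hstack matches concatenate along axis 1.
same_2d = np.array_equal(np.hstack((q1, q3)),
                         np.concatenate((q1, q3), axis=1))
print(same_2d)  # True

# For 1D arrays, hstack joins them end to end (axis 0).
same_1d = np.array_equal(np.hstack(([1, 2], [3, 4])),
                         np.concatenate(([1, 2], [3, 4]), axis=0))
print(same_1d)  # True
```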

stack

The stack function stacks arrays along a new axis. All arrays have to have the same shape.

q8 = np.stack((q1, q3))
q8

array([[[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.]],
[[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.]]])

q8.shape

(2, 3, 4)
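By default the new axis comes first, but stack also takes an axis argument that says where to put it. A short sketch of how the resulting shape changes (the names front, middle and back are illustrative):

```python
import numpy as np

q1 = np.full((3, 4), 1.0)
q3 = np.full((3, 4), 3.0)

front = np.stack((q1, q3))            # new axis at position 0 (default)
middle = np.stack((q1, q3), axis=1)   # new axis at position 1
back = np.stack((q1, q3), axis=-1)    # new axis at the end

print(front.shape)   # (2, 3, 4)
print(middle.shape)  # (3, 2, 4)
print(back.shape)    # (3, 4, 2)
```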

Splitting arrays

Splitting is the opposite of stacking. For example, let's use the vsplit function to split a matrix vertically.

First let's create a 6x4 matrix:

r = np.arange(24).reshape(6,4)
r

array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])

Now let's split it in three equal parts, vertically:

r1, r2, r3 = np.vsplit(r, 3)
r1

array([[0, 1, 2, 3],
[4, 5, 6, 7]])

r2

array([[ 8, 9, 10, 11],
[12, 13, 14, 15]])

r3

array([[16, 17, 18, 19],
[20, 21, 22, 23]])

There is also a split function which splits an array along any given axis. Calling vsplit is equivalent to calling split with axis=0. There is also an hsplit function, equivalent to calling split with axis=1:

r4, r5 = np.hsplit(r, 2)
r4

array([[ 0, 1],
[ 4, 5],
[ 8, 9],
[12, 13],
[16, 17],
[20, 21]])

r5

array([[ 2, 3],
[ 6, 7],
[10, 11],
[14, 15],
[18, 19],
[22, 23]])
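The sketch below checks the vsplit/split equivalence and shows two further behaviours of split: passing a list of split points instead of a count, and the error raised when the array cannot be divided evenly. The names parts, top, middle, bottom and split_error are illustrative.

```python
import numpy as np

r = np.arange(24).reshape(6, 4)

# split with axis=0 gives the same pieces as vsplit.
parts = np.split(r, 3, axis=0)
same = all(np.array_equal(x, y) for x, y in zip(parts, np.vsplit(r, 3)))
print(same)  # True

# A list of indices gives split points instead of a number of parts:
# rows [0:1], [1:4] and [4:].
top, middle, bottom = np.split(r, [1, 4], axis=0)
print(top.shape, middle.shape, bottom.shape)  # (1, 4) (3, 4) (2, 4)

# 6 rows cannot be divided into 4 equal parts.
try:
    np.vsplit(r, 4)
    split_error = None
except ValueError as e:
    split_error = e
print(type(split_error).__name__)  # ValueError
```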

Transposing arrays

The transpose method creates a new view on an ndarray's data, with axes permuted in the given order.

For example, let's create a 3D array:

t = np.arange(24).reshape(4,2,3)
t

array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23]]])

Now let's create an ndarray such that the axes 0, 1, 2 (depth, height, width) are re-ordered to 1, 2, 0 (depth→width, height→depth, width→height):

t1 = t.transpose((1,2,0))
t1

array([[[ 0, 6, 12, 18],
[ 1, 7, 13, 19],
[ 2, 8, 14, 20]],
[[ 3, 9, 15, 21],
[ 4, 10, 16, 22],
[ 5, 11, 17, 23]]])

t1.shape

(2, 3, 4)

By default, transpose reverses the order of the dimensions:

t2 = t.transpose()  # equivalent to t.transpose((2, 1, 0))
t2

array([[[ 0, 6, 12, 18],
[ 3, 9, 15, 21]],
[[ 1, 7, 13, 19],
[ 4, 10, 16, 22]],
[[ 2, 8, 14, 20],
[ 5, 11, 17, 23]]])

t2.shape

(3, 2, 4)
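Since transpose returns a view, writing through the transposed array changes the original. The sketch below illustrates this, and also uses the T attribute, which is shorthand for transpose() with no arguments. The array is rebuilt here so the block runs on its own.

```python
import numpy as np

t = np.arange(24).reshape(4, 2, 3)
t1 = t.transpose((1, 2, 0))

# t1 is a view: no data was copied.
print(np.shares_memory(t, t1))  # True

# With axes (1, 2, 0), t1[i, j, k] is t[k, i, j].
t1[0, 0, 1] = -6
print(int(t[1, 0, 0]))          # -6

# t.T reverses the axes, just like t.transpose().
print(np.array_equal(t.T, t.transpose()))  # True
print(t.T.shape)                           # (3, 2, 4)
```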