numpy array indexing
One-dimensional arrays
a = np.array([1, 5, 3, 19, 13, 7, 3])
a[3]
19
a[2:5]
array([ 3, 19, 13])
a[2:-1]
array([ 3, 19, 13, 7])
a[:2]
array([1, 5])
a[2::2]
array([ 3, 13, 3])
a[::-1]
array([ 3, 7, 13, 19, 3, 5, 1])
Of course, you can modify elements:
a[3] = 999
a
array([ 1, 5, 3, 999, 13, 7, 3])
You can also modify an ndarray slice:
a[2:5] = [997, 998, 999]
a
array([ 1, 5, 997, 998, 999, 7, 3])
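Differences with regular Python arrays (lists)
Unlike a regular Python list, if you assign a single value to an ndarray slice, it is copied across the whole slice (this is broadcasting):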
a[2:5] = -1
a
array([ 1, 5, -1, -1, -1, 7, 3])
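To make the contrast concrete, here is a small sketch that runs the same slice assignment on an ndarray and on a plain Python list. The variable names are illustrative only, and the exact wording of the list error message may vary between Python versions.

```python
import numpy as np

values = [1, 5, 3, 19, 13, 7, 3]

# ndarray: the scalar is broadcast to every position of the slice.
arr = np.array(values)
arr[2:5] = -1
arr_result = arr.tolist()

# list: a scalar is rejected, because slice assignment expects an iterable.
lst = list(values)
try:
    lst[2:5] = -1
    list_error = None
except TypeError as e:
    list_error = str(e)

# list: assigning a shorter iterable changes the length of the list.
lst[2:5] = [-1]
list_result = lst

print(arr_result)   # the ndarray keeps its 7 elements
print(list_error)
print(list_result)  # the list now has 5 elements
```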
Also, you cannot grow or shrink ndarrays this way:
try:
    a[2:5] = [1,2,3,4,5,6] # too long
except ValueError as e:
    print(e)
cannot copy sequence with size 6 to array axis with dimension 3
You cannot delete elements either:
try:
    del a[2:5]
except ValueError as e:
    print(e)
cannot delete array elements
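If you do need an array with fewer or more elements, one option is to build a new array instead of resizing in place. The sketch below uses np.delete and np.concatenate, both of which return new arrays and leave the original untouched; the variable names are illustrative.

```python
import numpy as np

a = np.array([1, 5, 3, 19, 13, 7, 3])

# np.delete returns a NEW array without the elements at the given indices.
shorter = np.delete(a, [2, 3, 4])

# To "grow" a region, rebuild the array from its pieces.
longer = np.concatenate((a[:2], [1, 2, 3, 4, 5, 6], a[5:]))

print(shorter)  # elements at indices 2, 3 and 4 removed
print(longer)   # three elements replaced by six
print(a)        # the original array is unchanged
```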
Last but not least, ndarray slices are actually views on the same data buffer. This means that if you create a slice and modify it, you are actually going to modify the original ndarray as well!
a_slice = a[2:6]
a_slice[1] = 1000
a # the original array was modified!
array([ 1, 5, -1, 1000, -1, 7, 3])
If you want a copy of the data, you need to use the copy method:
another_slice = a[2:6].copy()
another_slice[1] = 3000
a # the original array is untouched
array([ 1, 5, -1, 1000, -1, 7, 3])
a[3] = 4000
another_slice # similarly, modifying the original array does not affect the slice copy
array([ -1, 3000, -1, 7])
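When it is not obvious whether an array is a view or a copy, you can check rather than guess. This sketch uses np.shares_memory and the base attribute on a fresh array; the variable names are illustrative.

```python
import numpy as np

a = np.array([1, 5, 3, 19, 13, 7, 3])
view = a[2:6]          # basic slicing gives a view
copy = a[2:6].copy()   # an explicit copy owns its own buffer

view_shares = np.shares_memory(a, view)
copy_shares = np.shares_memory(a, copy)

# A view keeps a reference to the array that owns the data in .base;
# an array that owns its data has .base set to None.
view_base_is_a = view.base is a
copy_base = copy.base

print(view_shares, copy_shares)   # True False
print(view_base_is_a, copy_base)  # True None
```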
Multi-dimensional arrays
Multi-dimensional arrays can be accessed in a similar way by providing an index or slice for each axis, separated by commas:
b = np.arange(48).reshape(4, 12)
b
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]])
b[1, 2] # row 1, col 2
14
b[1, :] # row 1, all columns
array([12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23])
b[:, 1] # all rows, column 1
array([ 1, 13, 25, 37])
b[1, :]
array([12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23])
b[1:2, :]
array([[12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]])
The first expression returns row 1 as a 1D array of shape (12,), while the second returns that same row as a 2D array of shape (1, 12).
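The difference is easiest to see by printing the shapes. The sketch below also applies the same idea to a column, which the text above does not show explicitly.

```python
import numpy as np

b = np.arange(48).reshape(4, 12)

row_1d = b[1, :]     # an integer index drops the axis
row_2d = b[1:2, :]   # a slice of length 1 keeps the axis
col_1d = b[:, 1]
col_2d = b[:, 1:2]

print(row_1d.shape)  # (12,)
print(row_2d.shape)  # (1, 12)
print(col_1d.shape)  # (4,)
print(col_2d.shape)  # (4, 1)
```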
Fancy indexing
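You can also pass a list (or tuple) of indices for an axis. This is referred to as fancy indexing.
# rows 0 and 2, columns 2 to 4 (5-1)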
b[(0,2), 2:5]
array([[ 2, 3, 4],
[26, 27, 28]])
# all rows, columns -1 (last), 2 and -1 (again, and in this order)
b[:, (-1, 2, -1)]
array([[11, 2, 11],
[23, 14, 23],
[35, 26, 35],
[47, 38, 47]])
If you provide multiple index arrays, you get a 1D ndarray containing the values of the elements at the specified coordinates.
# returns a 1D array with b[-1, 5], b[2, 9], b[-1, 1] and b[2, 9] (again)
b[(-1, 2, -1, 2), (5, 9, 1, 9)]
array([41, 33, 37, 33])
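One point worth checking yourself: unlike basic slices, fancy indexing returns a copy, so modifying the result does not touch the original array. A minimal sketch, with illustrative variable names:

```python
import numpy as np

b = np.arange(48).reshape(4, 12)

picked = b[[0, 2], 2:5]   # rows 0 and 2, columns 2 to 4
picked[0, 0] = -1         # modify the result...

shares = np.shares_memory(b, picked)
original_value = int(b[0, 2])   # ...the original element is unchanged

print(shares)          # False
print(original_value)  # 2
```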
Higher dimensions
c = b.reshape(4, 2, 6)
c
array([[[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11]],
[[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]],
[[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]],
[[36, 37, 38, 39, 40, 41],
[42, 43, 44, 45, 46, 47]]])
c[2, 1, 4] # matrix 2, row 1, col 4
34
c[2, :, 3] # matrix 2, all rows, col 3
array([27, 33])
If you omit coordinates for some axes, then all elements in these axes are returned:
# Return matrix 2, row 1, all columns. This is equivalent to c[2, 1, :]
c[2, 1]
array([30, 31, 32, 33, 34, 35])
Ellipsis (...)
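An ellipsis (...) stands for as many full slices (:) as are needed to cover all the axes you did not specify.
# matrix 2, all rows, all columns. This is equivalent to c[2, :, :]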
c[2, ...]
array([[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])
# matrix 2, row 1, all columns. This is equivalent to c[2, 1, :]
c[2, 1, ...]
array([30, 31, 32, 33, 34, 35])
# matrix 2, all rows, column 3. This is equivalent to c[2, :, 3]
c[2, ..., 3]
array([27, 33])
# all matrices, all rows, column 3. This is equivalent to c[:, :, 3]
c[..., 3]
array([[ 3, 9],
[15, 21],
[27, 33],
[39, 45]])
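The equivalences listed above can be verified directly. The sketch below also shows that an index may contain only one ellipsis; a second one raises an IndexError, whose exact message may differ between NumPy versions.

```python
import numpy as np

c = np.arange(48).reshape(4, 2, 6)

same_matrix = np.array_equal(c[2, ...], c[2, :, :])
same_row = np.array_equal(c[2, 1, ...], c[2, 1, :])
same_middle = np.array_equal(c[2, ..., 3], c[2, :, 3])
same_column = np.array_equal(c[..., 3], c[:, :, 3])

# Only one ellipsis is allowed per index expression.
try:
    c[..., 1, ...]
    double_ellipsis_error = None
except IndexError as e:
    double_ellipsis_error = str(e)

print(same_matrix, same_row, same_middle, same_column)
print(double_ellipsis_error)
```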
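Boolean indexing
You can also provide an ndarray of boolean values for one axis to select the indices you want along it: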
b = np.arange(48).reshape(4, 12)
b
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
[36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]])
rows_on = np.array([True, False, True, False])
b[rows_on, :] # Rows 0 and 2, all columns. Equivalent to b[(0, 2), :]
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]])
cols_on = np.array([False, True, False] * 4)
b[:, cols_on] # All rows, columns 1, 4, 7 and 10
array([[ 1, 4, 7, 10],
[13, 16, 19, 22],
[25, 28, 31, 34],
[37, 40, 43, 46]])
np.ix_
You cannot use boolean indexing this way on multiple axes, but you can work around this by using the ix_ function:
b[np.ix_(rows_on, cols_on)]
array([[ 1, 4, 7, 10],
[25, 28, 31, 34]])
np.ix_(rows_on, cols_on)
(array([[0],
[2]]), array([[ 1, 4, 7, 10]]))
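To see why ix_ is needed, the sketch below passes the two boolean arrays directly (which fails, because the selected row and column indices cannot be broadcast together) and compares the ix_ result with indexing one axis at a time. The exact error message may differ between NumPy versions.

```python
import numpy as np

b = np.arange(48).reshape(4, 12)
rows_on = np.array([True, False, True, False])
cols_on = np.array([False, True, False] * 4)

# Passing both masks directly pairs up 2 row indices with 4 column indices.
try:
    b[rows_on, cols_on]
    direct_error = None
except IndexError as e:
    direct_error = str(e)

via_ix = b[np.ix_(rows_on, cols_on)]
two_step = b[rows_on][:, cols_on]   # select rows first, then columns
same = np.array_equal(via_ix, two_step)

print(direct_error)
print(same)   # True
```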
If you use a boolean array that has the same shape as the ndarray, then you get in return a 1D array containing all the values that have True at their coordinate. This is generally used along with conditional operators:
b[b % 3 == 1]
array([ 1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46])
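A boolean mask can also be used on the left-hand side of an assignment. The sketch below works on a copy so that b itself is left alone; the names mask and zeroed are illustrative.

```python
import numpy as np

b = np.arange(48).reshape(4, 12)

mask = b % 3 == 1        # boolean array with the same shape as b
count = int(mask.sum())  # number of True entries

zeroed = b.copy()
zeroed[mask] = 0         # assign to every selected element at once

print(mask.shape, mask.dtype)  # (4, 12) bool
print(count)                   # 16
print(zeroed[0])
```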
Iterating
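Iterating over an ndarray works much like iterating over a regular Python list, except that a multi-dimensional array is iterated along its first axis.
# A 3D array (composed of two 3x4 matrices)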
c = np.arange(24).reshape(2, 3, 4)
c
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
for m in c:
    print("Item:")
    print(m)
Item:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
Item:
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]
You can also iterate by index, since len(c) is the size of the first axis:
for i in range(len(c)): # Note that len(c) == c.shape[0]
    print("Item:")
    print(c[i])
Item:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
Item:
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]
If you want to iterate on all elements in the ndarray, simply iterate over the flat attribute:
for i in c.flat:
    print("Item:", i)
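The loop above visits the elements in row-major order (last axis varying fastest). If you also need each element's coordinates, np.ndenumerate is one option; it is not used elsewhere in this text, so treat the sketch below as a side note.

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)

# c.flat yields the elements one by one, in row-major order.
flat_values = [int(x) for x in c.flat]

# np.ndenumerate yields (index tuple, value) pairs in the same order.
pairs = list(np.ndenumerate(c))
first_index, first_value = pairs[0]
last_index, last_value = pairs[-1]

print(flat_values[:6])
print(first_index, int(first_value))
print(last_index, int(last_value))
```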
Stacking arrays
q1 = np.full((3,4), 1.0)
q1
array([[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.]])
q2 = np.full((4,4), 2.0)
q2
array([[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.]])
q3 = np.full((3,4), 3.0)
q3
array([[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.]])
vstack
Now let's stack them vertically using vstack:
q4 = np.vstack((q1, q2, q3))
q4
array([[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.]])
q4.shape
(10, 4)
hstack
We can also stack arrays horizontally using hstack:
q5 = np.hstack((q1, q3))
q5
array([[ 1., 1., 1., 1., 3., 3., 3., 3.],
[ 1., 1., 1., 1., 3., 3., 3., 3.],
[ 1., 1., 1., 1., 3., 3., 3., 3.]])
q5.shape
(3, 8)
This is possible because q1 and q3 both have 3 rows. But since q2 has 4 rows, it cannot be stacked horizontally with q1 and q3:
try:
    q5 = np.hstack((q1, q2, q3))
except ValueError as e:
    print(e)
all the input array dimensions except for the concatenation axis must match exactly
concatenate
The concatenate function stacks arrays along any given existing axis.
q7 = np.concatenate((q1, q2, q3), axis=0) # Equivalent to vstack
q7
array([[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 2., 2., 2., 2.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.]])
q7.shape
(10, 4)
As you might guess, hstack is equivalent to calling concatenate with axis=1.
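A quick check of that equivalence for the 2D arrays used here. Note that for 1D inputs hstack joins along the only axis instead, so the axis=1 equivalence should be read as applying to arrays with at least two dimensions.

```python
import numpy as np

q1 = np.full((3, 4), 1.0)
q3 = np.full((3, 4), 3.0)

h = np.hstack((q1, q3))
c1 = np.concatenate((q1, q3), axis=1)
same_2d = np.array_equal(h, c1)

# With 1D arrays there is no axis 1: hstack concatenates along axis 0.
flat = np.hstack((np.array([1, 2]), np.array([3])))

print(same_2d)   # True
print(h.shape)   # (3, 8)
print(flat)      # [1 2 3]
```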
stack
The stack function stacks arrays along a new axis. All arrays must have the same shape.
q8 = np.stack((q1, q3))
q8
array([[[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.],
[ 1., 1., 1., 1.]],
[[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.],
[ 3., 3., 3., 3.]]])
q8.shape
(2, 3, 4)
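By default the new axis is inserted at position 0, but stack also accepts an axis argument. The sketch below shows how the resulting shape changes with it.

```python
import numpy as np

q1 = np.full((3, 4), 1.0)
q3 = np.full((3, 4), 3.0)

first = np.stack((q1, q3))            # new axis at position 0 (default)
middle = np.stack((q1, q3), axis=1)   # new axis at position 1
last = np.stack((q1, q3), axis=-1)    # new axis at the end

print(first.shape)   # (2, 3, 4)
print(middle.shape)  # (3, 2, 4)
print(last.shape)    # (3, 4, 2)
```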
Splitting arrays
Splitting is the opposite of stacking. For example, let's use the vsplit function to split a matrix vertically.
First let's create a 6x4 matrix:
r = np.arange(24).reshape(6,4)
r
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])
Now let's split it in three equal parts, vertically:
r1, r2, r3 = np.vsplit(r, 3)
r1
array([[0, 1, 2, 3],
[4, 5, 6, 7]])
r2
array([[ 8, 9, 10, 11],
[12, 13, 14, 15]])
r3
array([[16, 17, 18, 19],
[20, 21, 22, 23]])
There is also a split function which splits an array along any given axis. Calling vsplit is equivalent to calling split with axis=0. There is also an hsplit function, equivalent to calling split with axis=1:
r4, r5 = np.hsplit(r, 2)
r4
array([[ 0, 1],
[ 4, 5],
[ 8, 9],
[12, 13],
[16, 17],
[20, 21]])
r5
array([[ 2, 3],
[ 6, 7],
[10, 11],
[14, 15],
[18, 19],
[22, 23]])
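The examples above split into equal parts. As a sketch of two related options: split also accepts a list of indices at which to cut, and np.array_split tolerates parts of unequal size, whereas vsplit raises a ValueError when the rows cannot be divided evenly.

```python
import numpy as np

r = np.arange(24).reshape(6, 4)

# Cut at row index 4: rows 0-3 go to the first part, rows 4-5 to the second.
top, bottom = np.split(r, [4], axis=0)

# 6 rows cannot be divided into 4 equal parts.
try:
    np.vsplit(r, 4)
    uneven_error = None
except ValueError as e:
    uneven_error = str(e)

# array_split allows unequal parts instead of raising.
parts = np.array_split(r, 4, axis=0)
part_rows = [p.shape[0] for p in parts]

print(top.shape, bottom.shape)  # (4, 4) (2, 4)
print(uneven_error)
print(part_rows)                # [2, 2, 1, 1]
```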
Transposing arrays
The transpose method creates a new view on an ndarray's data, with axes permuted in the given order.
For example, let's create a 3D array:
t = np.arange(24).reshape(4,2,3)
t
array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23]]])
Now let's create an ndarray such that the axes 0, 1, 2 (depth, height, width) are re-ordered to 1, 2, 0 (depth→width, height→depth, width→height):
t1 = t.transpose((1,2,0))
t1
array([[[ 0, 6, 12, 18],
[ 1, 7, 13, 19],
[ 2, 8, 14, 20]],
[[ 3, 9, 15, 21],
[ 4, 10, 16, 22],
[ 5, 11, 17, 23]]])
t1.shape
(2, 3, 4)
By default, transpose reverses the order of the dimensions:
t2 = t.transpose() # equivalent to t.transpose((2, 1, 0))
t2
array([[[ 0, 6, 12, 18],
[ 3, 9, 15, 21]],
[[ 1, 7, 13, 19],
[ 4, 10, 16, 22]],
[[ 2, 8, 14, 20],
[ 5, 11, 17, 23]]])
t2.shape
(3, 2, 4)
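Because transpose returns a view, writing through the transposed array changes the original. The sketch below checks this, and also that the T attribute gives the same result as calling transpose with no arguments.

```python
import numpy as np

t = np.arange(24).reshape(4, 2, 3)
t1 = t.transpose((1, 2, 0))   # t1[i, j, k] is t[k, i, j]

shares = np.shares_memory(t, t1)

# Writing through the view changes the original array.
t1[0, 0, 1] = -1
written_through = int(t[1, 0, 0])

# .T is shorthand for transpose() with the axes reversed.
same_as_T = np.array_equal(t.T, t.transpose())

print(shares)           # True
print(written_through)  # -1
print(same_as_T)        # True
print(t.T.shape)        # (3, 2, 4)
```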