## NumPy personal tips

NumPy is one of the essential tools for data analysis and numerical computation, and it is almost always needed when implementing machine learning and similar work. I am leaving these notes as a personal reminder. For details, please refer to the official documentation.

### GitHub

• The Jupyter notebook version of this article is available on GitHub here.

### Author’s environment

The author’s environment and import method are as follows.

!sw_vers

ProductName: Mac OS X
ProductVersion: 10.14.6
BuildVersion: 18G2022

!python -V

Python 3.7.3

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

import numpy as np

np.__version__

'1.16.2'


## Getting statistics

### np.max(x)

Returns the maximum value of an array.

Define $a$ as a second-order tensor.

a = np.array([
[1,8,3],
[6,5,4],
[7,2,9]
])


Define $b$ as a third-order tensor.

b = np.array([
[
[1,8,3],
[6,5,4],
[7,2,9]
],
[
[1,9,4],
[7,2,5],
[6,8,3]
]
])

print('-' * 20)
print('a : \n',a)
print()
print('np.max(a) : \n',np.max(a))
print()
print('np.max(a, axis=0) : \n',np.max(a, axis=0))
print()
print('np.max(a, axis=1) : \n',np.max(a, axis=1))
print()

print('-' * 20)
print('b : \n',b)
print()
print('np.max(b) : \n',np.max(b))
print()
print('np.max(b, axis=0) : \n',np.max(b, axis=0))

print()
print('np.max(b, axis=1) : \n',np.max(b, axis=1))

print()
print('np.max(b, axis=2) : \n',np.max(b, axis=2))

--------------------
a :
[[1 8 3]
[6 5 4]
[7 2 9]]

np.max(a) :
9

np.max(a, axis=0) :
[7 8 9]

np.max(a, axis=1) :
[8 6 9]

--------------------
b :
[[[1 8 3]
[6 5 4]
[7 2 9]]

[[1 9 4]
[7 2 5]
[6 8 3]]]

np.max(b) :
9

np.max(b, axis=0) :
[[1 9 4]
[7 5 5]
[7 8 9]]

np.max(b, axis=1) :
[[7 8 9]
[7 9 5]]

np.max(b, axis=2) :
[[8 6 9]
[9 7 8]]
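As a small sketch of how `axis` affects the result, the following checks the shapes that come back when reducing the third-order tensor above. The variable names are only illustrative, and the `keepdims` and tuple-of-axes calls are extras that the examples above do not use.

```python
import numpy as np

b = np.array([
    [[1, 8, 3], [6, 5, 4], [7, 2, 9]],
    [[1, 9, 4], [7, 2, 5], [6, 8, 3]],
])

# Reducing along an axis removes that axis from the shape (2, 3, 3).
shape_axis0 = np.max(b, axis=0).shape  # axis 0 removed
shape_axis1 = np.max(b, axis=1).shape  # axis 1 removed
shape_axis2 = np.max(b, axis=2).shape  # axis 2 removed

# keepdims=True keeps the reduced axis with length 1 instead of dropping it.
shape_keep = np.max(b, axis=2, keepdims=True).shape

# A tuple of axes reduces over several axes at once:
# here, one maximum per 3x3 block.
max_per_block = np.max(b, axis=(1, 2))

print(shape_axis0, shape_axis1, shape_axis2, shape_keep, max_per_block)
```

Keeping the reduced axis is mostly useful when the result has to be broadcast back against the original array.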

print('-' * 20)
print('a : \n',a)
print()
print('np.argmax(a) : \n',np.argmax(a))
print()
print('np.argmax(a, axis=0) : \n',np.argmax(a, axis=0))
print()
print('np.argmax(a, axis=1) : \n',np.argmax(a, axis=1))
print()

print('-' * 20)
print('b : \n',b)
print()
print('np.argmax(b) : \n',np.argmax(b))
print()
print('np.argmax(b, axis=0) : \n',np.argmax(b, axis=0))

print()
print('np.argmax(b, axis=1) : \n',np.argmax(b, axis=1))

print()
print('np.argmax(b, axis=2) : \n',np.argmax(b, axis=2))

--------------------
a :
[[1 8 3]
[6 5 4]
[7 2 9]]

np.argmax(a) :
8

np.argmax(a, axis=0) :
[2 0 2]

np.argmax(a, axis=1) :
[1 0 2]

--------------------
b :
[[[1 8 3]
[6 5 4]
[7 2 9]]

[[1 9 4]
[7 2 5]
[6 8 3]]]

np.argmax(b) :
8

np.argmax(b, axis=0) :
[[0 1 1]
[1 0 1]
[0 1 0]]

np.argmax(b, axis=1) :
[[2 0 2]
[1 0 1]]

np.argmax(b, axis=2) :
[[1 0 2]
[1 0 1]]


### np.argmax(x)

Returns the position of the largest value in the array.

a = np.random.randint(100,size=10)

print('a : ',a)
print('max position : ',np.argmax(a))

a : [53 35 94 2 3 14 21 55 17 6]
max position : 2
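Without `axis`, `np.argmax` returns an index into the flattened array. One way to turn that into row and column coordinates is `np.unravel_index`; the sketch below uses a fixed matrix rather than random data so the result is reproducible.

```python
import numpy as np

a = np.array([[1, 8, 3], [6, 5, 4], [7, 2, 9]])

# Index of the maximum in the flattened array.
flat_index = np.argmax(a)

# Convert the flat index into a (row, column) tuple for the shape of a.
coords = np.unravel_index(flat_index, a.shape)

print(flat_index, coords, a[coords])
```

When the maximum occurs more than once, `np.argmax` reports only the first occurrence.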


### np.min(x)

Returns the minimum value of the array.

a = np.random.randint(100,size=10)

print('a : ',a)
print('min : ',np.min(a))

a : [36 42 6 71 92 23 44 92 36 79]
min : 6


### np.argmin(x)

Returns the position of the smallest value in the array.

a = np.random.randint(100,size=10)

print('a : ',a)
print('min position : ',np.argmin(a))

a : [51 76 59 12 28 50 21 61 49 37]
min position : 3


### np.maximum(x,y)

Compares two arrays element-wise and creates a new ndarray containing the larger value at each position.

a = np.random.randint(100,size=10)
b = np.random.randint(100,size=10)

print('a : ',a)
print('b : ',b)
print('max : ',np.maximum(a,b))

a : [25 78 95 45 79 33 72 33 38 81]
b : [41 91 64 7 60 54 29 25 99 88]
max : [41 91 95 45 79 54 72 33 99 88]
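Because `np.maximum` broadcasts its arguments, one of them can be a scalar. A common use, sketched here with made-up input values, is the ReLU function from machine learning. The comparison with `np.max` is only there to underline the difference between the two functions.

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])  # illustrative values

# The scalar 0 is broadcast against every element of x,
# so negative entries are replaced by 0.
relu = np.maximum(x, 0)

# np.max, by contrast, reduces one array to its single largest element.
largest = np.max(x)

print(relu, largest)
```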


### np.minimum(x,y)

Compares two arrays element-wise and creates a new ndarray containing the smaller value at each position.

a = np.random.randint(100,size=10)
b = np.random.randint(100,size=10)

print('a : ',a)
print('b : ',b)
print('min : ',np.minimum(a,b))

a : [80 81 40 80 47 81 17 86 91 63]
b : [84 51 7 4 62 66 83 85 21 66]
min : [80 51 7 4 47 66 17 85 21 63]
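Combining `np.minimum` and `np.maximum` limits values to a range, which is what `np.clip` does in a single call. The bounds `low` and `high` below are hypothetical values chosen for illustration.

```python
import numpy as np

x = np.array([-5, 0, 3, 12, 20])
low, high = 0, 10  # hypothetical bounds

# Raise everything below `low`, then lower everything above `high`.
clipped_manual = np.minimum(np.maximum(x, low), high)

# np.clip performs the same operation directly.
clipped = np.clip(x, low, high)

print(clipped_manual, clipped)
```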


### np.sum(a, axis=None, dtype=None, out=None, keepdims=[no value], initial=[no value], where=[no value])

Returns the sum of the array elements.

a = np.arange(10)
np.sum(a)

45


Sum along a specified axis.

a = np.arange(12).reshape(3,4)

print('a : ')
print(a)
print('sum axis=0 : ', np.sum(a, axis=0))
print('sum axis=1 : ', np.sum(a, axis=1))

a :
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
sum axis=0 : [12 15 18 21]
sum axis=1 : [ 6 22 38]
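The `keepdims` parameter in the signature is handy when the sums are used to rescale the original array. A minimal sketch, assuming we want each row to sum to 1 (the name `normalized` is just for illustration):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# With keepdims=True the result has shape (3, 1) instead of (3,),
# so it broadcasts against a row by row.
row_sums = np.sum(a, axis=1, keepdims=True)

# Divide every row by its own sum.
normalized = a / row_sums

print(row_sums.shape)
print(normalized.sum(axis=1))
```

Without `keepdims=True`, the division would try to broadcast a shape `(3,)` array against shape `(3, 4)` and raise an error.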


### np.average(a, axis=None, weights=None, returned=False)

Computes the average. A weighted average is also available.

First, the plain average of an array.

a = np.arange(10)
np.average(a)

4.5


The average along an axis.

a = np.arange(12).reshape(3,4)

print('a : ', a)
print('average axis = 0 : ',np.average(a, axis=0))
print('average axis = 1 : ',np.average(a, axis=1))

a : [[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
average axis = 0 : [4. 5. 6. 7.]
average axis = 1 : [1.5 5.5 9.5]


Specify the weights to get a weighted average.

a = np.arange(5)

# Set the weights as desired (np.average divides by their sum, so they need not sum to 1)
w = np.array([0.15,0.2,0.5,0.15,0.05])

np.average(a,weights=w)

1.7619047619047616
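To make the weighting explicit, the weighted average can be written by hand as the weighted sum divided by the sum of the weights. The values and weights below are made up for illustration and are not the ones used above.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])
weights = np.array([4.0, 3.0, 2.0, 1.0])  # illustrative; need not sum to 1

# Weighted average by hand: sum(v * w) / sum(w).
manual = np.sum(values * weights) / np.sum(weights)

# The same quantity from np.average.
builtin = np.average(values, weights=weights)

# returned=True additionally returns the sum of the weights.
avg, weight_sum = np.average(values, weights=weights, returned=True)

print(manual, builtin, weight_sum)
```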


### np.mean(a, axis=None, dtype=None, out=None, keepdims=[no value])

Computes the mean. Unlike np.average, it cannot compute a weighted average, but you can specify the dtype used for the calculation.

x = np.arange(10)
np.mean(x)

4.5


Compute with an integer type. The fractional part of the result is discarded.

x = np.arange(10)
np.mean(x, dtype='int8')

array([4], dtype=int8)


### np.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=[no value])

Computes the standard deviation. The default ddof=0 divides by N, giving the population standard deviation.

x = np.arange(10)
np.std(x)

2.8722813232690143


### np.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=[no value])

Computes the variance. The default ddof=0 divides by N, giving the population variance.

x = np.arange(10)
np.var(x)

8.25
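The `ddof` parameter in both signatures changes the divisor from N to N - ddof. A short sketch of the difference, using the same `np.arange(10)` input (the variable names are illustrative):

```python
import numpy as np

x = np.arange(10)
n = x.size

var_population = np.var(x)      # divides by N (ddof=0, the default)
var_unbiased = np.var(x, ddof=1)  # divides by N - 1

# The standard deviation is the square root of the variance.
std_population = np.std(x)

# The two variances differ only by the factor N / (N - 1).
ratio = var_unbiased / var_population

print(var_population, var_unbiased, std_population, ratio)
```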


### np.median(a, axis=None, out=None, overwrite_input=False, keepdims=False)

Computes the median. When the number of elements is even, it is the mean of the two middle values.

x = np.arange(10)
print(x)
print('median x : ',np.median(x))
print()

x = np.arange(11)
print(x)
print('median x : ',np.median(x))

[0 1 2 3 4 5 6 7 8 9]
median x : 4.5

[ 0 1 2 3 4 5 6 7 8 9 10]
median x : 5.0
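The median is the 50th percentile, so `np.percentile` gives the same value, and `axis` works as in the other functions. The small matrix `m` below is made up for illustration.

```python
import numpy as np

x = np.arange(10)

median = np.median(x)        # mean of the two middle values, 4 and 5
p50 = np.percentile(x, 50)   # the 50th percentile is the same quantity

m = np.array([[1, 9, 2], [7, 3, 8]])  # illustrative data
median_rows = np.median(m, axis=1)    # median of each row

print(median, p50, median_rows)
```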


### np.cov(m, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None)

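Computes the covariance matrix. By default each row is treated as a variable and the result is normalized by N-1 (unbiased variance); with bias=True it is normalized by N (sample variance). Additional variables can be supplied through y.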

a = np.random.randint(10,size=9).reshape(3,3)
b = np.arange(3)

print('a : ')
print(a)
print()

print('Covariance matrix with unbiased variance')
print(np.cov(a))
print()

print('Covariance matrix with sample variance')
print(np.cov(a, bias=True))
print()

print('Sample variance for each component : match diagonal components of covariance matrix')
print('var a[0] = ', np.var(a[0]))
print('var a[1] = ', np.var(a[1]))
print('var a[2] = ', np.var(a[2]))
print()

print('b : ')
print(b)
print(np.cov(a,b, bias=True))

a :
[[2 2 1]
[0 1 6]
[0 9 3]]

Covariance matrix with unbiased variance
[[ 0.33333333 -1.83333333 0.5 ]
[-1.83333333 10.33333333 -0.5 ]
[ 0.5 -0.5 21. ]]

Covariance matrix with sample variance
[[ 0.22222222 -1.22222222 0.33333333]
[-1.22222222 6.88888889 -0.33333333]
[ 0.33333333 -0.33333333 14. ]]

Sample variance for each component : match diagonal components of covariance matrix
var a[0] = 0.2222222222222222
var a[1] = 6.888888888888888
var a[2] = 14.0

b :
[0 1 2]
[[ 0.22222222 -1.22222222 0.33333333 -0.33333333]
[-1.22222222 6.88888889 -0.33333333 2. ]
[ 0.33333333 -0.33333333 14. 1. ]
[-0.33333333 2. 1. 0.66666667]]
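As a check on what `np.cov` computes, the covariance matrix can be reproduced by centering each row and taking products of the deviations. This sketch reuses the matrix printed above as fixed data. The names `centered`, `cov_unbiased` and `cov_biased` are only for illustration.

```python
import numpy as np

# Rows are variables, columns are observations (rowvar=True, the default).
a = np.array([[2, 2, 1], [0, 1, 6], [0, 9, 3]])
n = a.shape[1]

# Subtract each row's mean from that row.
centered = a - a.mean(axis=1, keepdims=True)

# Sum of products of deviations, divided by N - 1 or by N.
cov_unbiased = centered @ centered.T / (n - 1)
cov_biased = centered @ centered.T / n

print(np.allclose(cov_unbiased, np.cov(a)))
print(np.allclose(cov_biased, np.cov(a, bias=True)))
```

The diagonal of `cov_biased` matches `np.var` of each row, which is the relationship shown in the output above.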


### np.corrcoef(x, y=None, rowvar=True, bias=[no value], ddof=[no value])

Computes the matrix of correlation coefficients. As with np.cov, each row is treated as a variable by default.

a = np.random.randint(10,size=9).reshape(3,3)
np.corrcoef(a)

array([[ 1. , 0.24019223, -0.75592895],
[ 0.24019223, 1. , -0.81705717],
[-0.75592895, -0.81705717, 1. ]])
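The correlation matrix is the covariance matrix with each entry divided by the standard deviations of the two variables involved. A minimal sketch of that relationship, using a fixed matrix instead of random data so that it is reproducible:

```python
import numpy as np

a = np.array([[2, 2, 1], [0, 1, 6], [0, 9, 3]])  # fixed example data

cov = np.cov(a)

# Standard deviation of each variable, taken from the diagonal of cov.
std = np.sqrt(np.diag(cov))

# Divide entry (i, j) by std[i] * std[j].
corr_manual = cov / np.outer(std, std)

print(np.allclose(corr_manual, np.corrcoef(a)))
```

The diagonal is 1 because every variable is perfectly correlated with itself, and the matrix is symmetric.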