NumPy personal tips
NumPy is one of the essential tools for data analysis and numerical computation, and it is needed almost any time you implement machine learning. I am leaving these notes as a personal reminder. For details, please refer to the official NumPy documentation.
Contents
- 1. Basic operations
- 2. Trigonometric functions
- 3. Exponential and logarithmic functions
- 4. Statistical functions <= this article
- 5. Linear algebra
- 6. Sampling
- 7. Miscellaneous
GitHub
- The Jupyter notebook version of this article is available on GitHub here.
Author’s environment
The author’s environment and import method are as follows.
!sw_vers
ProductName: Mac OS X
ProductVersion: 10.14.6
BuildVersion: 18G2022
!python -V
Python 3.7.3
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import numpy as np
np.__version__
'1.16.2'
Statistical functions
np.max(x)
Returns the maximum value of an array. With axis, the maximum is taken along that axis.
Define $a$ as a second-order tensor (a 2-D array).
a = np.array([
    [1,8,3],
    [6,5,4],
    [7,2,9]
])
Define $b$ as a third-order tensor (a 3-D array).
b = np.array([
    [
        [1,8,3],
        [6,5,4],
        [7,2,9]
    ],
    [
        [1,9,4],
        [7,2,5],
        [6,8,3]
    ]
])
print('-' * 20)
print('a : \n',a)
print()
print('np.max(a) : \n',np.max(a))
print()
print('np.max(a, axis=0) : \n',np.max(a, axis=0))
print()
print('np.max(a, axis=1) : \n',np.max(a, axis=1))
print()
print('-' * 20)
print('b : \n',b)
print()
print('np.max(b) : \n',np.max(b))
print()
print('np.max(b, axis=0) : \n',np.max(b, axis=0))
print()
print('np.max(b, axis=1) : \n',np.max(b, axis=1))
print()
print('np.max(b, axis=2) : \n',np.max(b, axis=2))
--------------------
a :
 [[1 8 3]
 [6 5 4]
 [7 2 9]]

np.max(a) :
 9

np.max(a, axis=0) :
 [7 8 9]

np.max(a, axis=1) :
 [8 6 9]

--------------------
b :
 [[[1 8 3]
  [6 5 4]
  [7 2 9]]

 [[1 9 4]
  [7 2 5]
  [6 8 3]]]

np.max(b) :
 9

np.max(b, axis=0) :
 [[1 9 4]
 [7 5 5]
 [7 8 9]]

np.max(b, axis=1) :
 [[7 8 9]
 [7 9 5]]

np.max(b, axis=2) :
 [[8 6 9]
 [9 7 8]]
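One way to read the axis results above is to look at shapes: reducing over an axis removes that axis from the result. The following is a minimal sketch; the array `t` is a hypothetical example chosen so that every axis has a different length.

```python
import numpy as np

# Hypothetical 3-D array with distinct axis lengths so each axis is easy to spot.
t = np.arange(24).reshape(2, 3, 4)

# Reducing over an axis removes that axis from the shape.
shape_axis0 = np.max(t, axis=0).shape
shape_axis1 = np.max(t, axis=1).shape
shape_axis2 = np.max(t, axis=2).shape

# keepdims=True keeps the reduced axis with length 1 instead of dropping it.
shape_keep = np.max(t, axis=1, keepdims=True).shape

print(shape_axis0, shape_axis1, shape_axis2, shape_keep)
```

Keeping the reduced axis is mainly useful when the result has to broadcast against the original array.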
np.argmax(x)
Returns the position (index) of the largest value in the array. Without axis, the index refers to the flattened array; with axis, one index is returned for each slice along that axis.
print('-' * 20)
print('a : \n',a)
print()
print('np.argmax(a) : \n',np.argmax(a))
print()
print('np.argmax(a, axis=0) : \n',np.argmax(a, axis=0))
print()
print('np.argmax(a, axis=1) : \n',np.argmax(a, axis=1))
print()
print('-' * 20)
print('b : \n',b)
print()
print('np.argmax(b) : \n',np.argmax(b))
print()
print('np.argmax(b, axis=0) : \n',np.argmax(b, axis=0))
print()
print('np.argmax(b, axis=1) : \n',np.argmax(b, axis=1))
print()
print('np.argmax(b, axis=2) : \n',np.argmax(b, axis=2))
--------------------
a :
 [[1 8 3]
 [6 5 4]
 [7 2 9]]

np.argmax(a) :
 8

np.argmax(a, axis=0) :
 [2 0 2]

np.argmax(a, axis=1) :
 [1 0 2]

--------------------
b :
 [[[1 8 3]
  [6 5 4]
  [7 2 9]]

 [[1 9 4]
  [7 2 5]
  [6 8 3]]]

np.argmax(b) :
 8

np.argmax(b, axis=0) :
 [[0 1 1]
 [1 0 1]
 [0 1 0]]

np.argmax(b, axis=1) :
 [[2 0 2]
 [1 0 1]]

np.argmax(b, axis=2) :
 [[1 0 2]
 [1 0 1]]
a = np.random.randint(100,size=10)
print('a : ',a)
print('max position : ',np.argmax(a))
a :  [53 35 94  2  3 14 21 55 17  6]
max position : 2
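For a multi-dimensional array, np.argmax without axis returns an index into the flattened array. A minimal sketch of converting that flat index back to coordinates with np.unravel_index follows; the array `m` simply repeats the 3x3 values used earlier.

```python
import numpy as np

m = np.array([
    [1, 8, 3],
    [6, 5, 4],
    [7, 2, 9],
])

# Index of the maximum in the flattened array.
flat_index = np.argmax(m)

# Convert the flat index into a (row, column) pair for the original shape.
row, col = np.unravel_index(flat_index, m.shape)

print(flat_index, row, col)
print(m[row, col] == np.max(m))
```

The same conversion works for np.argmin.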
np.min(x)
Returns the minimum value of the array.
a = np.random.randint(100,size=10)
print('a : ',a)
print('min : ',np.min(a))
a :  [36 42  6 71 92 23 44 92 36 79]
min : 6
np.argmin(x)
Returns the position (index) of the smallest value in the array.
a = np.random.randint(100,size=10)
print('a : ',a)
print('min position : ',np.argmin(a))
a :  [51 76 59 12 28 50 21 61 49 37]
min position : 3
np.maximum(x,y)
Compares two arrays element-wise and returns a new ndarray containing the larger value at each position.
a = np.random.randint(100,size=10)
b = np.random.randint(100,size=10)
print('a : ',a)
print('b : ',b)
print('max : ',np.maximum(a,b))
a :  [25 78 95 45 79 33 72 33 38 81]
b :  [41 91 64  7 60 54 29 25 99 88]
max :  [41 91 95 45 79 54 72 33 99 88]
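Because np.maximum works element-wise and follows broadcasting rules, the second argument can also be a scalar. The sketch below uses this to write a ReLU-style function and contrasts it with np.max, which reduces the array to a single value. The input `x` is a hypothetical example.

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])

# np.maximum broadcasts: every element is compared against the scalar 0.
relu = np.maximum(x, 0)

# np.max, in contrast, reduces the whole array to one value.
largest = np.max(x)

print(relu)
print(largest)
```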
np.minimum(x,y)
Compares two arrays element-wise and returns a new ndarray containing the smaller value at each position.
a = np.random.randint(100,size=10)
b = np.random.randint(100,size=10)
print('a : ',a)
print('b : ',b)
print('min : ',np.minimum(a,b))
a :  [80 81 40 80 47 81 17 86 91 63]
b :  [84 51  7  4 62 66 83 85 21 66]
min :  [80 51  7  4 47 66 17 85 21 63]
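Combining np.maximum and np.minimum limits values to a range, which is what np.clip does in a single call. A minimal sketch, where the bounds `low` and `high` are hypothetical values chosen for illustration:

```python
import numpy as np

x = np.array([-5, 0, 3, 8, 12])
low, high = 0, 10  # hypothetical bounds

# Raise values below `low`, then lower values above `high`.
clipped_manual = np.minimum(np.maximum(x, low), high)

# np.clip expresses the same operation directly.
clipped = np.clip(x, low, high)

print(clipped_manual)
print(np.array_equal(clipped_manual, clipped))
```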
np.sum(a, axis=None, dtype=None, out=None, keepdims=<no value>, initial=<no value>, where=<no value>)
Returns the sum of the array elements, optionally along a given axis.
a = np.arange(10)
np.sum(a)
45
Compute the sum along an axis.
a = np.arange(12).reshape(3,4)
print('a : ')
print(a)
print('sum axis=0 : ', np.sum(a, axis=0))
print('sum axis=1 : ', np.sum(a, axis=1))
a :
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
sum axis=0 :  [12 15 18 21]
sum axis=1 :  [ 6 22 38]
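A common use of the axis sum is normalizing each row so that it sums to 1. The sketch below assumes the same 3x4 array as above and uses keepdims=True so that the row sums keep shape (3, 1) and broadcast against the original array.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)

# keepdims=True keeps the summed axis with length 1: shape (3, 1), not (3,).
row_sums = np.sum(m, axis=1, keepdims=True)

# Broadcasting divides every row by its own sum.
normalized = m / row_sums

print(row_sums.shape)
print(normalized.sum(axis=1))
```

Without keepdims=True the sums would have shape (3,), which does not broadcast row-wise against a (3, 4) array.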
np.average(a, axis=None, weights=None, returned=False)
Finds the average. A weighted average is also supported.
A simple average of the array:
a = np.arange(10)
np.average(a)
4.5
The average along an axis:
a = np.arange(12).reshape(3,4)
print('a : ', a)
print('average axis = 0 : ',np.average(a, axis=0))
print('average axis = 1 : ',np.average(a, axis=1))
a :  [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
average axis = 0 :  [4. 5. 6. 7.]
average axis = 1 :  [1.5 5.5 9.5]
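Specify weights with the weights argument. np.average divides the weighted sum by the sum of the weights, so the weights do not have to sum to 1.
a = np.arange(5)
# Set the weights as desired (these sum to 1.0)
w = np.array([0.1,0.2,0.5,0.15,0.05])
# (0*0.1 + 1*0.2 + 2*0.5 + 3*0.15 + 4*0.05) / 1.0, which is about 1.85
np.average(a,weights=w)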
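The weighted average is the weighted sum divided by the sum of the weights. The sketch below checks this by hand and also uses returned=True to get the sum of the weights back. The values and weights are hypothetical and deliberately do not sum to 1.

```python
import numpy as np

values = np.array([10.0, 20.0, 30.0])
weights = np.array([1.0, 2.0, 3.0])  # hypothetical weights; they sum to 6, not 1

# Weighted sum divided by the sum of the weights.
manual = np.sum(values * weights) / np.sum(weights)

# returned=True also returns the sum of the weights.
avg, weight_sum = np.average(values, weights=weights, returned=True)

print(manual, avg, weight_sum)
```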
np.mean(a, axis=None, dtype=None, out=None, keepdims=<no value>)
Finds the arithmetic mean. Unlike np.average, it cannot compute a weighted average, but the dtype used for the calculation can be specified.
x = np.arange(10)
np.mean(x)
4.5
Compute with an integer type. The fractional part of the result is discarded.
x = np.arange(10)
np.mean(x, dtype='int8')
4
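The dtype argument is more often used in the other direction: accumulating low-precision data in a higher-precision type. A minimal sketch with a hypothetical float32 array:

```python
import numpy as np

# Hypothetical float32 data; 0.1 is not exactly representable in binary floating point.
x32 = np.full(1000, 0.1, dtype=np.float32)

mean32 = np.mean(x32)                    # computed and returned as float32
mean64 = np.mean(x32, dtype=np.float64)  # accumulated and returned as float64

print(mean32.dtype, mean64.dtype)
print(mean32, mean64)
```

For large float32 arrays, the float64 accumulator can noticeably reduce rounding error in the result.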
np.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>)
Finds the standard deviation. With the default ddof=0, the squared deviations are divided by N.
x = np.arange(10)
np.std(x)
2.8722813232690143
np.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>)
Finds the variance. With the default ddof=0, the squared deviations are divided by N.
x = np.arange(10)
np.var(x)
8.25
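Both np.var and np.std divide by N - ddof. The sketch below compares the default ddof=0 with ddof=1, which gives the unbiased variance estimate, on the same array as above.

```python
import numpy as np

x = np.arange(10)

var_population = np.var(x)        # ddof=0: divide by N
var_unbiased = np.var(x, ddof=1)  # ddof=1: divide by N - 1
std_unbiased = np.std(x, ddof=1)  # square root of the unbiased variance

print(var_population, var_unbiased, std_unbiased)
```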
np.median(a, axis=None, out=None, overwrite_input=False, keepdims=False)
Finds the median. For an even number of elements, it is the mean of the two middle values.
x = np.arange(10)
print(x)
print('median x : ',np.median(x))
print()
x = np.arange(11)
print(x)
print('median x : ',np.median(x))
[0 1 2 3 4 5 6 7 8 9]
median x : 4.5
[ 0 1 2 3 4 5 6 7 8 9 10]
median x : 5.0
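The median is less sensitive to outliers than the mean. A minimal sketch with a hypothetical array containing one extreme value; it also shows that the median equals the 50th percentile.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 100])  # hypothetical data with one outlier

mean_value = np.mean(x)      # pulled up by the outlier
median_value = np.median(x)  # middle value of the sorted data
q50 = np.percentile(x, 50)   # the 50th percentile is the median

print(mean_value, median_value, q50)
```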
np.cov(m, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None)
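Finds the covariance matrix, treating each row as one variable. By default the unbiased estimate is returned (normalized by N-1); with bias=True the normalization is by N, which this article calls the sample variance. An additional set of variables can be passed as y.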
a = np.random.randint(10,size=9).reshape(3,3)
b = np.arange(3)
print('a : ')
print(a)
print()
print('Covariance matrix with unbiased variance')
print(np.cov(a))
print()
print('Covariance matrix with sample variance')
print(np.cov(a, bias=True))
print()
print('Sample variance for each component : match diagonal components of covariance matrix')
print('var a[0] = ', np.var(a[0]))
print('var a[1] = ', np.var(a[1]))
print('var a[2] = ', np.var(a[2]))
print()
print('Add b')
print('b : ')
print(b)
print(np.cov(a,b, bias=True))
a :
[[2 2 1]
 [0 1 6]
 [0 9 3]]

Covariance matrix with unbiased variance
[[ 0.33333333 -1.83333333  0.5       ]
 [-1.83333333 10.33333333 -0.5       ]
 [ 0.5        -0.5        21.        ]]

Covariance matrix with sample variance
[[ 0.22222222 -1.22222222  0.33333333]
 [-1.22222222  6.88888889 -0.33333333]
 [ 0.33333333 -0.33333333 14.        ]]

Sample variance for each component : match diagonal components of covariance matrix
var a[0] =  0.2222222222222222
var a[1] =  6.888888888888888
var a[2] =  14.0

Add b
b :
[0 1 2]
[[ 0.22222222 -1.22222222  0.33333333 -0.33333333]
 [-1.22222222  6.88888889 -0.33333333  2.        ]
 [ 0.33333333 -0.33333333 14.          1.        ]
 [-0.33333333  2.          1.          0.66666667]]
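The covariance matrix can also be computed by hand: subtract each row's mean, multiply the centered matrix by its transpose, and divide by N-1 (unbiased) or N (bias=True). A minimal sketch using the same 3x3 values as the output above:

```python
import numpy as np

a = np.array([
    [2, 2, 1],
    [0, 1, 6],
    [0, 9, 3],
])

n = a.shape[1]  # number of observations per variable (row)

# Subtract each row's mean so every variable is centered at zero.
centered = a - a.mean(axis=1, keepdims=True)

cov_unbiased = centered @ centered.T / (n - 1)  # matches np.cov(a)
cov_biased = centered @ centered.T / n          # matches np.cov(a, bias=True)

print(np.allclose(cov_unbiased, np.cov(a)))
print(np.allclose(cov_biased, np.cov(a, bias=True)))
```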
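np.corrcoef(x, y=None, rowvar=True, bias=<no value>, ddof=<no value>)
Finds the matrix of Pearson correlation coefficients. Each row of the input is treated as one variable.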
a = np.random.randint(10,size=9).reshape(3,3)
np.corrcoef(a)
array([[ 1.        ,  0.24019223, -0.75592895],
       [ 0.24019223,  1.        , -0.81705717],
       [-0.75592895, -0.81705717,  1.        ]])
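The correlation matrix is the covariance matrix with each entry divided by the product of the two standard deviations. A minimal sketch of that relationship, using a fixed hypothetical array instead of random values so that the result is reproducible:

```python
import numpy as np

a = np.array([
    [2, 2, 1],
    [0, 1, 6],
    [0, 9, 3],
])

c = np.cov(a)            # covariance matrix
d = np.sqrt(np.diag(c))  # standard deviation of each variable

# corr[i, j] = cov[i, j] / (std[i] * std[j])
corr_manual = c / np.outer(d, d)

print(np.allclose(corr_manual, np.corrcoef(a)))
```

The diagonal is always 1, since every variable is perfectly correlated with itself.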