Notes on how to use matplotlib
This is a memo on how to use matplotlib. I’ll update it as I learn new ways to use it.
github
- The file in jupyter notebook format is here
google colaboratory
- To run it in Google Colaboratory, see here (matplotlib/mat_nb.ipynb)
Author’s environment
The author’s OS is macOS, so some command options differ from their Linux and Unix counterparts.
! sw_vers
ProductName: Mac OS X
ProductVersion: 10.14.6
BuildVersion: 18G6020
! python -V
Python 3.7.3
Import the basic libraries and check their versions.
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib
import matplotlib.pyplot as plt
import scipy
import numpy as np
print('matplotlib version :', matplotlib.__version__)
print('scipy version :', scipy.__version__)
print('numpy version :', np.__version__)
matplotlib version : 3.0.3
scipy version : 1.4.1
numpy version : 1.16.2
Draw a simple graph
Let’s plot the function $y = \sin x$. matplotlib offers many esoteric options, but in my data analysis work this is the form I use most often. Setting up the grid and labels is important. Whether you can use TeX notation in matplotlib depends on your environment.
- plt.grid()
- plt.title()
- plt.xlabel()
- plt.ylabel()
- plt.xlim()
- plt.legend()
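On the TeX point above, here is a minimal sketch, assuming matplotlib's built-in mathtext renderer (the default, which does not need a LaTeX installation). Raw strings (the r"..." prefix) are my own addition here; they let you write the backslash once instead of doubling it. The rcParams key text.usetex reports whether an external LaTeX is used instead.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.figure()
plt.grid()
# In a raw string the backslash is kept as-is, so "\sin" needs no doubling.
plt.xlabel(r"$x$")
plt.ylabel(r"$y = \sin(x)$")
plt.plot(x, y, label=r"$y = \sin x$")
plt.legend()

# With the default setting the labels are rendered by mathtext,
# not by an external LaTeX installation.
uses_external_tex = plt.rcParams["text.usetex"]
print("text.usetex :", uses_external_tex)
```

The raw-string labels are identical to the double-backslash versions used in the rest of this memo, so either style works.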
x = np.linspace(0,10,100)
y = np.sin(x)
plt.grid()
plt.title("sin function")
plt.xlabel("$x$")
plt.ylabel("$y = \\sin(x)$")
plt.xlim(0,8)
plt.ylim(-1.2,1.2)
plt.plot(x,y,label="$y=\\sin x$")
plt.legend()
<matplotlib.legend.Legend at 0x1175fc4a8>
Draw multiple graphs
x = np.linspace(0,10,100)
y1 = np.sin(x)
y2 = 0.8 * np.cos(x)
plt.grid()
plt.title("multi function")
plt.xlabel("$x$")
plt.xlim(0,8)
plt.ylim(-1.2,1.2)
plt.plot(x,y1, label="$y = \\sin x$")
plt.plot(x,y2, label="$y = 0.8 \\times \\cos x$")
plt.legend()
<matplotlib.legend.Legend at 0x1177909b0>
Change graph linetype and color
x = np.linspace(0,10,100)
y1 = np.sin(x)
y2 = 0.8 * np.cos(x)
plt.grid()
plt.title("multi function")
plt.xlabel("$x$")
plt.xlim(0,8)
plt.ylim(-1.2,1.2)
plt.plot(x, y1, "o", color="red", label="$y = \\sin x$")
plt.plot(x, y2, "x", color="blue", label="$y = 0.8 \\times \\cos x$")
plt.legend()
<matplotlib.legend.Legend at 0x117774898>
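The example above changes only the marker and the color. As a rough sketch of the line-type side, the format string passed to plt.plot can combine a color letter with a line style or marker, and the same settings are available as keyword arguments. The third curve, $0.5 \sin x$, is just an illustrative addition of mine.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

plt.figure()
plt.grid()
# "r--" : red dashed line, no markers.
(line1,) = plt.plot(x, np.sin(x), "r--", label=r"$y = \sin x$")
# "bx" : blue x markers, no connecting line.
(line2,) = plt.plot(x, 0.8 * np.cos(x), "bx", label=r"$y = 0.8 \times \cos x$")
# The same kind of settings written out as keyword arguments.
(line3,) = plt.plot(x, 0.5 * np.sin(x), color="green", linestyle=":",
                    linewidth=2, label=r"$y = 0.5 \times \sin x$")
plt.legend()
```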
Create a histogram
Here is a histogram of 10000 samples drawn from the standard normal distribution. Its shape approximates the probability density of the distribution.
x = np.random.randn(10000)
plt.hist(x)
print(np.random.rand())
0.3934044138335787
You can also set options such as the number of bins. These become second nature after you have used them a few times.
x = np.random.randn(10000)
plt.hist(x, bins=20,color="red")
(array([ 14., 23., 63., 155., 267., 480., 677., 981., 1277.,
1287., 1249., 1162., 912., 645., 421., 222., 92., 42.,
29., 2.]),
array([-3.30855493, -2.971971 , -2.63538708, -2.29880315, -1.96221922,
-1.62563529, -1.28905136, -0.95246744, -0.61588351, -0.27929958,
0.05728435, 0.39386828, 0.73045221, 1.06703613, 1.40362006,
1.74020399, 2.07678792, 2.41337185, 2.74995577, 3.0865397 ,
3.42312363]),
<a list of 20 Patch objects>)
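The tuple printed above holds the bin counts, the bin edges, and the drawn patches. A small sketch with np.histogram, which computes the same kind of counts and edges without drawing anything, may make the numbers easier to read. The fixed seed is an arbitrary choice of mine for reproducibility, and density=True is shown as one way to rescale the bars to a density.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary fixed seed, only for reproducibility
x = rng.randn(10000)

# Counts per bin and the bin edges, as in the first two items returned by plt.hist.
counts, edges = np.histogram(x, bins=20)
print(len(counts), len(edges))  # 20 bins are delimited by 21 edges
print(counts.sum())             # every sample falls into exactly one bin

# With density=True the bar heights are rescaled so that the total area is 1.
density, _ = np.histogram(x, bins=20, density=True)
area = np.sum(density * np.diff(edges))
print(area)
```

plt.hist accepts the same bins and density arguments, so plt.hist(x, bins=20, density=True) draws the rescaled version.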
Draw a three-dimensional graph
Although I do not need them often, let’s try drawing a three-dimensional graph. Data analysis often deals with hundreds of dimensions, but humans can intuitively grasp three at most. To be honest, I have a hard time even with three.
We will use a module called mplot3d. 3D graphs also typically rely on a numpy function called meshgrid.
To plot the plane $z = x + y$ in $xyz$ space, you need $N(x) \times N(y)$ grid points, where $N(x)$ and $N(y)$ are the numbers of elements in $x$ and $y$. Normally you would have to build the coordinate arrays for all of these points yourself, but meshgrid creates them for you automatically.
Let’s look at an example.
x = np.array([i for i in range(5)])
y = np.array([i for i in range(5)])
print('x :', x)
print('y :', y)
print()
xx, yy = np.array(np.meshgrid(x, y))
print('xx :', xx)
print()
print('yy :', yy)
x : [0 1 2 3 4]
y : [0 1 2 3 4]

xx : [[0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]]

yy : [[0 0 0 0 0]
 [1 1 1 1 1]
 [2 2 2 2 2]
 [3 3 3 3 3]
 [4 4 4 4 4]]
Here $xx$ holds the $x$-coordinates of the 25 grid points, and $yy$ holds their $y$-coordinates. Very useful.
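To connect this back to the plane $z = x + y$ mentioned earlier, here is a small sketch that evaluates it on the grid. The name zz is my own, and the comparison with broadcasting is only there to show what meshgrid saves you from writing.

```python
import numpy as np

x = np.arange(5)
y = np.arange(5)
xx, yy = np.meshgrid(x, y)

# Element-wise arithmetic evaluates z = x + y at all 25 grid points at once.
zz = xx + yy
print(zz.shape)
print(zz)

# With the default indexing, zz[i, j] belongs to the point (x[j], y[i]).
i, j = 3, 1
print(zz[i, j] == x[j] + y[i])

# The same grid of values written with explicit broadcasting.
zz_broadcast = x[np.newaxis, :] + y[:, np.newaxis]
print(np.array_equal(zz, zz_broadcast))
```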
from mpl_toolkits.mplot3d import Axes3D
# Create a two-variable function as appropriate
def get_y(x1,x2):
    return x1 + 3 * x2 * x2 + 1
x1 = np.linspace(0,10,20)
x2 = np.linspace(0,10,20)
X1, X2 = np.meshgrid(x1, x2)
Y = get_y(X1, X2)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.set_xlabel("$x_1$")
ax.set_ylabel("$x_2$")
ax.set_zlabel("$f(x_1, x_2)$")
ax.plot(np.ravel(X1), np.ravel(X2), np.ravel(Y), "o", color='blue')
plt.show()
fig = plt.figure()
ax1 = fig.add_subplot(111, projection='3d')
ax1.set_xlabel("$x_1$")
ax1.set_ylabel("$x_2$")
ax1.set_zlabel("$f(x_1, x_2)$")
ax1.scatter3D(np.ravel(X1), np.ravel(X2), np.ravel(Y))
plt.show()
For 3D plots, rather than calling pyplot (plt) functions directly, create a figure object with fig = plt.figure() and draw the graph on it. You can use plot_surface to color the surface according to its value.
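As a rough sketch of plot_surface, the same two-variable function from the scatter plots can be drawn as a colored surface. The colormap name 'viridis' and the colorbar are my own choices for illustration, not something the plots above require.

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (needed for '3d' on older matplotlib)


def get_y(x1, x2):
    return x1 + 3 * x2 * x2 + 1


x1 = np.linspace(0, 10, 20)
x2 = np.linspace(0, 10, 20)
X1, X2 = np.meshgrid(x1, x2)
Y = get_y(X1, X2)

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.set_xlabel("$x_1$")
ax.set_ylabel("$x_2$")
ax.set_zlabel("$f(x_1, x_2)$")

# plot_surface takes the 2D grids directly (no ravel needed);
# cmap colors each face according to its height.
surf = ax.plot_surface(X1, X2, Y, cmap="viridis", linewidth=0)
# Color scale showing which color corresponds to which value.
fig.colorbar(surf, ax=ax, shrink=0.7)
```

Unlike ax.plot and scatter3D, which take flattened coordinate arrays, plot_surface needs the grid shape to know which points are neighbors.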
Let’s plot a multivariate Gaussian distribution.
Let’s compute the probability density of a two-dimensional Gaussian distribution with a negative correlation ($-0.8$ here).
from scipy.stats import multivariate_normal

mu = np.array([0,0])
sigma = np.array([[1,-0.8],[-0.8,1]])

x1 = np.linspace(-3,3,100)
x2 = np.linspace(-3,3,100)

# Build the grid and stack it into an array of (x1, x2) points
X1, X2 = np.meshgrid(x1, x2)
X = np.c_[np.ravel(X1), np.ravel(X2)]
Z = multivariate_normal.pdf(X, mu, sigma).reshape(100, -1)

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X1, X2, Z, cmap='bwr', linewidth=0)
fig.show()
/Users/hiroshi/anaconda3/lib/python3.7/site-packages/matplotlib/figure.py:445: UserWarning: Matplotlib is currently using module://ipykernel.pylab.backend_inline, which is a non-GUI backend, so cannot show the figure.
% get_backend())
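As a sanity check on the density plotted above, the Gaussian formula can be written out directly with NumPy. This is only a sketch: gaussian_pdf is a hypothetical helper name of mine, and the three test points are picked by hand to show the effect of the negative correlation.

```python
import numpy as np

mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, -0.8], [-0.8, 1.0]])


def gaussian_pdf(points, mu, sigma):
    """Density of a multivariate Gaussian at each row of `points`."""
    d = mu.shape[0]
    diff = points - mu
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), one value per row.
    quad = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(sigma), diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    return np.exp(-0.5 * quad) / norm


# The mean, a point along the ridge (x1 = -x2), and a point across it (x1 = x2).
points = np.array([[0.0, 0.0], [1.0, -1.0], [1.0, 1.0]])
peak, along, across = gaussian_pdf(points, mu, sigma)
print(peak, along, across)
```

With a negative correlation the density at (1, -1) is much larger than at (1, 1), which matches the diagonal ridge in the surface plot.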