Python Tips
While doing data analysis, I needed Venn diagrams to visualize the data, so I’m writing the steps down so I don’t forget them. A Venn diagram is a handy way to see how datasets relate to each other, for example how many elements they share, which makes it useful during exploratory data analysis (EDA).
GitHub
- The Jupyter notebook version of this article is on GitHub here.
Google Colaboratory
- To run it on Google Colaboratory, go here.
Author’s environment
sw_vers
ProductName: Mac OS X
ProductVersion: 10.14.6
BuildVersion: 18G103
python -V
Python 3.8.5
Install matplotlib-venn
!pip install matplotlib-venn
Import venn2.
from matplotlib_venn import venn2
Create a Venn diagram using venn2. First, set up the notebook and import matplotlib.
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt
# Optional: only needed for Japanese text in labels (requires `pip install japanize-matplotlib`)
# import japanize_matplotlib
Prepare two sample sets that partly overlap.
# Group A holds 0-49 and Group B holds 40-89, so they share 40-49
g1 = set(range(0, 50))
g2 = set(range(40, 90))
plt.figure(figsize=(6,4))
plt.title('Venn diagram for Group:A and Group:B')
venn2(subsets=[g1, g2], set_labels=('Group:A', 'Group:B'))
plt.show()
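To double-check the numbers shown in the diagram, the size of each region can be computed directly with Python set operations. This is only a minimal sketch: the helper name `venn2_region_sizes` is my own, chosen for illustration, and is not part of matplotlib-venn.

```python
def venn2_region_sizes(a, b):
    """Return the sizes of (A only, B only, A and B) for two collections."""
    a, b = set(a), set(b)
    return (len(a - b), len(b - a), len(a & b))


# Same sample data as in the article
g1 = set(range(0, 50))
g2 = set(range(40, 90))

sizes = venn2_region_sizes(g1, g2)
print(sizes)  # (40, 40, 10)
```

As far as I know, venn2 also accepts a tuple of three sizes in this same order (A only, B only, both) for `subsets`, so a tuple like this can be passed instead of the two sets when only the counts are available.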
Venn diagrams for three sets
matplotlib-venn can also draw a Venn diagram for three datasets; import venn3 to use it.
from matplotlib_venn import venn3
Prepare three sample sets that partly overlap.
# Group A holds 0-49, Group B holds 30-59, Group C holds 40-89; all three share 40-49
g1 = set(range(0, 50))
g2 = set(range(30, 60))
g3 = set(range(40, 90))
plt.figure(figsize=(6,4))
plt.title('Venn diagram for Group:A, Group:B, Group:C')
venn3(subsets=[g1, g2, g3], set_labels=('Group:A', 'Group:B', 'Group:C'))
plt.show()
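With three sets there are seven regions, and it is easy to misread which number belongs to which. The sketch below computes all seven sizes with set operations so they can be compared against the figure. The helper name `venn3_region_sizes` is hypothetical (not part of matplotlib-venn), and the order of the returned values is my assumption of a convenient one: A only, B only, A and B only, C only, A and C only, B and C only, all three.

```python
def venn3_region_sizes(a, b, c):
    """Return the sizes of the seven regions for three collections.

    Order: A only, B only, A&B only, C only, A&C only, B&C only, A&B&C.
    """
    a, b, c = set(a), set(b), set(c)
    return (
        len(a - b - c),      # A only
        len(b - a - c),      # B only
        len((a & b) - c),    # A and B, but not C
        len(c - a - b),      # C only
        len((a & c) - b),    # A and C, but not B
        len((b & c) - a),    # B and C, but not A
        len(a & b & c),      # in all three
    )


# Same sample data as in the article
g1 = set(range(0, 50))
g2 = set(range(30, 60))
g3 = set(range(40, 90))

sizes = venn3_region_sizes(g1, g2, g3)
print(sizes)  # (30, 0, 10, 30, 0, 10, 10)
```

Two of the regions are empty here: every element of Group:B also belongs to Group:A or Group:C, and every element shared by A and C is also in B. The seven sizes add up to 90, the number of distinct values across the three sets.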
This is very useful, so don’t forget it.