Basics of Image Processing with OpenCV
OpenCV is a computer vision library that is widely used for image analysis and machine learning. It is easy to use and provides not only basic image transformations but also image filtering, face recognition, object recognition, object tracking, and other functions that come up constantly in practical applications. It is a library I always reach for when I work on image recognition in practice.
In this article, I’ll leave some examples of basic image processing, such as rotation, region detection, and blurring, as a note for future reference.
github
- The file in Jupyter Notebook format is here
google colaboratory
- To run it on Google Colaboratory, see here (convert/convert_nb.ipynb)
Environment
The author’s OS is macOS, so some command options may differ from those on Linux and other Unix systems.
My environment
sw_vers
ProductName: Mac OS X
ProductVersion: 10.14.6
BuildVersion: 18G95
python -V
Python 3.5.5 :: Anaconda, Inc.
import cv2
print('opencv version :', cv2.__version__)
opencv version : 3.4.1
We also import matplotlib to display the images, and set the inline figure format to svg so the figures look sharp on the web.
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
import matplotlib.pyplot as plt
Suppose you have a file called lena.jpg in the parent directory.
%%bash
ls -a ../ | grep jpg
binary_out.jpg
bitwise_out.jpg
gray_out.jpg
lena.jpg
lena_out.jpg
rotation.jpg
rotation_scale_1_angle_-30.jpg
rotation_scale_2_angle_-30.jpg
filename = '../lena.jpg'
Loading an image
Let’s load an image and display it in Jupyter Notebook using matplotlib.
img = cv2.imread(filename=filename)
# OpenCV loads images in BGR order; convert to RGB so matplotlib in Jupyter Notebook displays the colors correctly
rgb_img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(rgb_img)
plt.show()
# Make a grayscale image for later use
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
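If you want to check the grayscale version right away, tell matplotlib to use a gray colormap, since a single-channel image would otherwise be shown with the default colormap (a minimal sketch):

# display the single-channel image with a gray colormap
plt.imshow(gray_img, cmap='gray')
plt.show()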
Get Image Information
Check the image’s height, width, and number of channels (usually 3 for a color image).
def get_image_info(img):
    if len(img.shape) == 3:
        img_height, img_width, img_channels = img.shape[:3]
        print('img_channels :', img_channels)
    else:
        img_height, img_width = img.shape[:2]
    print('img_height :', img_height)
    print('img_width :', img_width)
get_image_info(img=img)
img_channels : 3
img_height : 225
img_width : 225
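Calling the same function on the grayscale image takes the else branch, so only the height and width are printed (225 each for this image):

# the grayscale image has no channel dimension, so img_channels is not printed
get_image_info(img=gray_img)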
Save the image
Use the imwrite method.
out_filename = '../lena_out.jpg'
cv2.imwrite(out_filename, img)
True
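imwrite infers the format from the file extension and returns True on success. The file listing earlier also contains gray_out.jpg, so presumably the grayscale image was saved the same way; a minimal sketch (the path here just mirrors the naming above):

# save the grayscale image next to the original
cv2.imwrite('../gray_out.jpg', gray_img)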
Rotate an image
Implement a function that rotates an image by an arbitrary angle and scale.
def rotate_image(img, scale, angle):
    if len(img.shape) == 3:
        img_height, img_width, img_channels = img.shape[:3]
    else:
        img_height, img_width = img.shape[:2]
    size = tuple([img_width, img_height])
    center = tuple([int(img_width / 2), int(img_height / 2)])
    rotation_matrix = cv2.getRotationMatrix2D(center, angle, scale)
    rotation_image = cv2.warpAffine(img, rotation_matrix, size, flags=cv2.INTER_CUBIC)
    # Convert BGR to RGB only when displaying in Jupyter Notebook
    rotation_rgb_img = cv2.cvtColor(rotation_image, cv2.COLOR_BGR2RGB)
    plt.imshow(rotation_rgb_img)
    plt.show()
    out_filename = '../rotation_scale_{}_angle_{}.jpg'.format(scale, angle)
    cv2.imwrite(out_filename, rotation_image)
Rotate the image by 30 degrees around the center of the image.
rotate_image(img=img, scale=1, angle=30)
Double the size of the image and rotate it 30 degrees around the center of the image.
rotate_image(img=img, scale=2, angle=30)
Rotate the image by -30 degrees around the center of the image.
rotate_image(img=img, scale=1, angle=-30)
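The heavy lifting here is done by cv2.getRotationMatrix2D, which builds the 2x3 affine matrix that warpAffine then applies. Printing it for the image loaded above makes the roles of angle and scale easy to see (a minimal sketch; the 30 degrees and scale 1.0 simply repeat the first call):

# build and inspect the 2x3 affine matrix for a 30-degree rotation around the center at scale 1.0
h, w = img.shape[:2]
center = (w // 2, h // 2)
print(cv2.getRotationMatrix2D(center, 30, 1.0))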
Extract Regions and Remove Margins
OpenCV allows you to extract regions in an image. We will use the findContours method to find the regions and remove the extra margins. In this example, we use an image from one of our development projects instead of lena.
temp_img = cv2.imread(filename='../10.svg')
contours_gray_img = cv2.cvtColor(temp_img, cv2.COLOR_BGR2GRAY)
plt.imshow(contours_gray_img)
plt.gray()
plt.show()
def get_contours(img, off_set=1):
    _contours_image = img
    image, contours, hierarchy = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    x1, y1, x2, y2 = [], [], [], []
    for i in range(1, len(contours)):
        # The contents of ret are (x, y, w, h)
        ret = cv2.boundingRect(contours[i])
        x1.append(ret[0])
        y1.append(ret[1])
        x2.append(ret[0] + ret[2])
        y2.append(ret[1] + ret[3])
    x1_min = min(x1)
    y1_min = min(y1)
    x2_max = max(x2)
    y2_max = max(y2)
    # Draw the resulting bounding box
    cv2.rectangle(_contours_image, (x1_min, y1_min), (x2_max, y2_max), (0, 255, 0), 2)
    # Cut out the image with a little margin
    y1_min -= off_set
    y2_max += off_set
    x1_min -= off_set
    x2_max += off_set
    _contours_image = img[y1_min:y2_max, x1_min:x2_max]
    plt.imshow(_contours_image)
    plt.gray()
    plt.show()
get_contours(img=contours_gray_img)
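Note that findContours treats every non-zero pixel as foreground, so for inputs that are not already close to black and white it is common to binarize the image first. A minimal sketch (the threshold value 127 is an arbitrary choice for illustration):

# binarize with a fixed threshold before extracting contours
_, binary_img = cv2.threshold(contours_gray_img, 127, 255, cv2.THRESH_BINARY)
get_contours(img=binary_img)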
Gaussian filter
A Gaussian filter replaces each pixel with a weighted average of the surrounding pixels, using a Gaussian function as the weights. The kernel size specifies the range that is averaged, and $\sigma$ is the standard deviation of the Gaussian.
def show_gaussian_filter(img, average_size, sigma):
    _ret_image = cv2.GaussianBlur(img, (average_size, average_size), sigmaX=sigma, sigmaY=sigma)
    # Convert BGR to RGB only when displaying in Jupyter Notebook
    ret_image = cv2.cvtColor(_ret_image, cv2.COLOR_BGR2RGB)
    plt.imshow(ret_image)
    plt.show()
show_gaussian_filter(img, average_size=11, sigma=10)
show_gaussian_filter(img, average_size=21, sigma=10)
show_gaussian_filter(img, average_size=3, sigma=10)
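To see the weights themselves, cv2.getGaussianKernel returns the 1-D kernel of Gaussian weights; GaussianBlur applies such a kernel separably in each direction, so the 2-D weights are the outer product of this vector with itself. The size and sigma below simply repeat the first call above:

# 1-D Gaussian weights for kernel size 11 and sigma 10
k = cv2.getGaussianKernel(11, 10)
print(k.ravel())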
Edge detection using the Canny method
The Canny method detects the boundaries (edges) of objects. It makes its decision from the gradient of the pixel values, but there are two thresholds, and it takes some experience to tune them well. threshold2 is the threshold that decides whether a pixel is accepted as an edge, and in my experience it works best to fix threshold2 first and then choose threshold1.
def show_canny_image(img, th1, th2, aperture):
    temp = cv2.Canny(img, threshold1=th1, threshold2=th2, apertureSize=aperture)
    # Canny returns a single-channel image, so display it in grayscale
    plt.imshow(temp)
    plt.gray()
    plt.show()
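For example, calling it with some arbitrary threshold values (100 and 200 here are just illustrative choices):

show_canny_image(gray_img, th1=100, th2=200, aperture=3)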
Let’s see how the image changes with two thresholds, threshold1 and threshold2.
# fig, axes = plt.subplots(nrows=5, ncols=5, sharex=False)
fig = plt.figure(figsize=(10.0, 10.0))
cnt = 0
for th1 in range(0, 500, 100):
    for th2 in range(0, 500, 100):
        cnt += 1
        if th1 <= th2:
            temp = cv2.Canny(gray_img, threshold1=th1, threshold2=th2, apertureSize=3)
            fig.add_subplot(5, 5, cnt)
            plt.gray()
            plt.axis('off')
            plt.imshow(temp)
plt.show()
In the figure above, threshold2 increases from left to right and threshold1 increases from top to bottom. Toward the lower right, where the thresholds are large, fewer pixels are recognized as edges and the images become mostly black. Honestly, though, it is hard to tell the individual effects of threshold1 and threshold2 apart in this example.