Basics of Image Processing with OpenCV

OpenCV is a computer vision library that is widely used for image analysis and machine learning. It is easy to use and provides not only basic image transformations but also image filtering, face recognition, object recognition, object tracking, and other functions that come up often in practical applications. I always reach for it when I work on image recognition in practice.

In this article, I’ll collect some examples of basic image processing, such as rotation, region detection, and blurring, as notes for future reference.


  • The file in jupyter notebook format is here

Google Colaboratory

  • To run it on Google Colaboratory, use convert/convert_nb.ipynb


I am writing this on macOS, so command options may differ from those on Linux and other Unix systems.

My environment

ProductName: Mac OS X
ProductVersion: 10.14.6
BuildVersion: 18G95
python -V
Python 3.5.5 :: Anaconda, Inc.
import cv2

print('opencv version :', cv2.__version__)
opencv version : 3.4.1

We also import matplotlib to display images. Figures are rendered as SVG so that they look better on the web.

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

import matplotlib.pyplot as plt

Suppose you have a file called lena.jpg in the parent directory.


ls -a ../ | grep jpg
filename = '../lena.jpg'

Loading an image

Let’s load an image and display it. Because we are working in Jupyter Notebook, matplotlib is used for the display.

img = cv2.imread(filename=filename)

# OpenCV loads images in BGR channel order, but matplotlib expects RGB,
# so convert before displaying in Jupyter Notebook.
rgb_img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(rgb_img)
plt.show()

# Make a grayscale image for later use
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
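
To make the channel-order point concrete, here is a minimal pure-Python sketch of what the two conversions do to a single pixel. The helper names `bgr_to_rgb` and `bgr_to_gray` are hypothetical and exist only for this illustration. The grayscale weights are the ones given in the OpenCV documentation for RGB-to-gray conversion. In practice `cv2.cvtColor` applies this to the whole image at once.

```python
def bgr_to_rgb(pixel):
    """Reverse the channel order of one (B, G, R) pixel."""
    b, g, r = pixel
    return (r, g, b)


def bgr_to_gray(pixel):
    """Weighted sum Y = 0.299 R + 0.587 G + 0.114 B, rounded to an integer."""
    b, g, r = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


# A pure-blue pixel as OpenCV stores it: (B, G, R)
blue_bgr = (255, 0, 0)
print(bgr_to_rgb(blue_bgr))   # (0, 0, 255)
print(bgr_to_gray(blue_bgr))  # 29
```

If the conversion to RGB is skipped, matplotlib reads the first channel as red, so a blue region would be drawn in red.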

Get Image Information

Check the height and width of the image and, if it is a color image, the number of channels (usually 3 for RGB).

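def get_image_info(img):
  # A color image has shape (height, width, channels); a grayscale image has shape (height, width)
  if len(img.shape) == 3:
    img_height, img_width, img_channels = img.shape[:3]
    print('img_channels :', img_channels)
  else:
    img_height, img_width = img.shape[:2]

  print('img_height :', img_height)
  print('img_width :', img_width)

get_image_info(img=img)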

img_channels : 3
img_height : 225
img_width : 225

Save the image

Use the cv2.imwrite function.

out_filename = './lena_out.jpg'
cv2.imwrite(out_filename, img)

Rotate an image

Implement a function that rotates an image by an arbitrary angle and scales it by an arbitrary factor.

def rotate_image(img, scale, angle):

  if len(img.shape) == 3:
    img_height, img_width, img_channels = img.shape[:3]
  else:
    img_height, img_width = img.shape[:2]

  # warpAffine expects the output size as (width, height)
  size = (img_width, img_height)
  center = (img_width // 2, img_height // 2)

  rotation_matrix = cv2.getRotationMatrix2D(center, angle, scale)
  rotation_image = cv2.warpAffine(img, rotation_matrix, size, flags=cv2.INTER_CUBIC)

  # Convert BGR to RGB only for display in Jupyter Notebook
  rotation_rgb_img = cv2.cvtColor(rotation_image, cv2.COLOR_BGR2RGB)
  plt.imshow(rotation_rgb_img)
  plt.show()

  out_filename = './rotation_scale_{}_angle_{}.jpg'.format(scale, angle)
  cv2.imwrite(out_filename, rotation_image)

Rotate the image by 30 degrees around the center of the image.

rotate_image(img=img, scale=1, angle=30)

Double the size of the image and rotate it 30 degrees around the center of the image.

rotate_image(img=img, scale=2, angle=30)

Rotate the image by -30 degrees around its center.

rotate_image(img=img, scale=1, angle=-30)
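
To see what the rotation matrix contains, the following pure-Python sketch rebuilds the 2x3 matrix from the formula given in the OpenCV documentation for `getRotationMatrix2D`. The function names `rotation_matrix_2d` and `apply_affine` are hypothetical helpers for this illustration, not OpenCV APIs, and the center (112, 112) is just the middle of the 225 x 225 sample image.

```python
import math


def rotation_matrix_2d(center, angle_deg, scale):
    """Build the 2x3 affine matrix for a rotation about `center` with scaling."""
    cx, cy = center
    theta = math.radians(angle_deg)
    alpha = scale * math.cos(theta)
    beta = scale * math.sin(theta)
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]


def apply_affine(matrix, point):
    """Map one (x, y) point through a 2x3 affine matrix."""
    x, y = point
    return tuple(row[0] * x + row[1] * y + row[2] for row in matrix)


m = rotation_matrix_2d(center=(112, 112), angle_deg=30, scale=1)
# The center of rotation maps to itself (up to floating-point rounding)
print(apply_affine(m, (112, 112)))
```

Because image coordinates have the y axis pointing down, a positive angle moves a point to the right of the center upward, which appears as a counter-clockwise rotation on screen.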

Removing margins by extracting regions

OpenCV allows you to extract regions in an image. We will use the findContours function to extract the regions and trim away the surrounding margin. In this example, we will use an image from one of our development projects instead of the Lena image.

temp_img = cv2.imread(filename='../10.svg')
contours_gray_img = cv2.cvtColor(temp_img, cv2.COLOR_BGR2GRAY)

def get_contours(img, off_set=1):
  # Work on a copy so that the input image is not modified
  _contours_image = img.copy()
  # OpenCV 3.x returns three values here; OpenCV 4.x returns only (contours, hierarchy)
  image, contours, hierarchy = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

  x1, y1, x2, y2 = [], [], [], []
  # Start from index 1 (contour 0 is skipped)
  for i in range(1, len(contours)):
    # The contents of ret are (x, y, w, h)
    ret = cv2.boundingRect(contours[i])
    x1.append(ret[0])
    y1.append(ret[1])
    x2.append(ret[0] + ret[2])
    y2.append(ret[1] + ret[3])

  x1_min = min(x1)
  y1_min = min(y1)
  x2_max = max(x2)
  y2_max = max(y2)

  # Draw the bounding box that encloses all detected regions
  cv2.rectangle(_contours_image, (x1_min, y1_min), (x2_max, y2_max), (0, 255, 0), 2)

  # Cut out the image with a little margin (clamped so the start indices stay non-negative).
  y1_min = max(y1_min - off_set, 0)
  y2_max += off_set
  x1_min = max(x1_min - off_set, 0)
  x2_max += off_set
  return _contours_image[y1_min:y2_max, x1_min:x2_max]
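
The core of the margin removal is merging many bounding rectangles into one enclosing box. The sketch below isolates that step in pure Python so it can be checked without an image. The name `union_bounding_box` and the sample rectangles are made up for this illustration. Clipping to the image size is an addition that assumes you want to avoid negative slice indices.

```python
def union_bounding_box(rects, margin, width, height):
    """Merge (x, y, w, h) rectangles into one box, padded and clipped to the image."""
    x1 = min(x for x, y, w, h in rects) - margin
    y1 = min(y for x, y, w, h in rects) - margin
    x2 = max(x + w for x, y, w, h in rects) + margin
    y2 = max(y + h for x, y, w, h in rects) + margin
    # Clip so that slicing never receives a negative or out-of-range index
    return max(x1, 0), max(y1, 0), min(x2, width), min(y2, height)


rects = [(10, 20, 30, 40), (0, 50, 20, 10)]
print(union_bounding_box(rects, margin=1, width=100, height=100))  # (0, 19, 41, 61)
```

The returned tuple is (x1, y1, x2, y2), so the crop would be `img[y1:y2, x1:x2]`, with rows first because image arrays are indexed as (row, column).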



Gaussian filter

A Gaussian filter replaces each pixel with a weighted average of the surrounding pixels, using a Gaussian function as the weights. You specify the kernel size, which sets the range to be averaged, and the standard deviation $\sigma$ of the Gaussian.

def show_gaussian_filter(img, average_size, sigma):
  # average_size is the kernel width and height; it must be a positive odd number
  _ret_image = cv2.GaussianBlur(img, (average_size, average_size), sigmaX=sigma, sigmaY=sigma)

  # Convert BGR to RGB only for display in Jupyter Notebook
  ret_image = cv2.cvtColor(_ret_image, cv2.COLOR_BGR2RGB)
  plt.imshow(ret_image)
  plt.show()

show_gaussian_filter(img, average_size=11, sigma=10)
show_gaussian_filter(img, average_size=21, sigma=10)
show_gaussian_filter(img, average_size=3, sigma=10)
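
To show how the kernel size and $\sigma$ interact, here is a small pure-Python sketch that computes normalized one-dimensional Gaussian weights. The function name `gaussian_kernel_1d` is hypothetical. The sketch assumes a positive `sigma` and is meant as an illustration of the weighting, not as a reproduction of OpenCV's internal implementation. The two-dimensional kernel is the outer product of two such one-dimensional kernels.

```python
import math


def gaussian_kernel_1d(size, sigma):
    """Return `size` normalized Gaussian weights centered on the middle element."""
    center = (size - 1) / 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


# With sigma=10 a 3-tap kernel is almost flat, so it behaves like a small box blur
print([round(w, 4) for w in gaussian_kernel_1d(3, 10)])  # [0.3328, 0.3344, 0.3328]
```

This is why, with sigma fixed at 10, the amount of blur in the three calls above is governed mostly by the kernel size: the kernel is much narrower than the Gaussian, so the weights are close to uniform.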

Edge detection using the Canny method

The Canny method is used to detect the boundaries (edges) of objects. At its core it decides based on the gradient of the pixel intensities, but it has two thresholds, and it takes some experience to tune them well. The larger threshold (threshold2 in the code below) decides whether a pixel is accepted as a definite edge, while the smaller one (threshold1) decides whether weaker pixels connected to such an edge are kept. In my experience, it is best to settle on threshold2 first and then tune threshold1.

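def show_canny_image(img, th1, th2, aperture):

  temp = cv2.Canny(img, threshold1=th1, threshold2=th2, apertureSize=aperture)

  # The Canny output is a single-channel edge map; convert it to RGB only for display in Jupyter Notebook
  temp = cv2.cvtColor(temp, cv2.COLOR_GRAY2RGB)
  plt.imshow(temp)
  plt.show()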
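
The role of the two thresholds is easier to see in a toy setting. The sketch below applies the same hysteresis idea to a one-dimensional list of gradient magnitudes: values above `high` are kept, and values above `low` are kept only if they are connected to a kept value. The function `hysteresis_1d` and the sample numbers are hypothetical. The real Canny detector works in two dimensions and also performs smoothing and non-maximum suppression, which are omitted here.

```python
def hysteresis_1d(magnitudes, low, high):
    """Keep strong samples (> high) plus weak ones (> low) connected to a strong one."""
    keep = [m > high for m in magnitudes]
    stack = [i for i, is_strong in enumerate(keep) if is_strong]
    while stack:
        i = stack.pop()
        # Grow the edge into neighbouring samples that pass the lower threshold
        for j in (i - 1, i + 1):
            if 0 <= j < len(magnitudes) and not keep[j] and magnitudes[j] > low:
                keep[j] = True
                stack.append(j)
    return keep


gradient = [10, 60, 120, 70, 20, 65, 10]
print(hysteresis_1d(gradient, low=50, high=100))
# [False, True, True, True, False, False, False]
```

The value 65 passes the lower threshold but is dropped because it does not touch a strong sample. Raising `high` removes whole edges, while raising `low` only shortens the edges that remain.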

Let’s see how the image changes with two thresholds, threshold1 and threshold2.

# fig, axes = plt.subplots(nrows=5, ncols=5, sharex=False)

fig = plt.figure(figsize=(10.0, 10.0))

cnt = 0
for th1 in range(0, 500, 100):
  for th2 in range(0, 500, 100):
    cnt += 1
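    if th1 <= th2:

      temp = cv2.Canny(gray_img, threshold1=th1, threshold2=th2, apertureSize=3)

      # Place each result in a 5x5 grid: rows follow threshold1, columns follow threshold2
      fig.add_subplot(5, 5, cnt)
      plt.imshow(temp, cmap='gray')

plt.show()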

In the figure above, the horizontal axis is threshold2 and the vertical axis is threshold1. The thresholds increase toward the right and toward the bottom. As the thresholds grow, fewer pixels are recognized as edges, so the images toward the lower right become predominantly black. To be honest, though, I can’t clearly see the separate effects of threshold1 and threshold2 in this example.
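
To make the grid layout explicit, this pure-Python sketch replays the loop counters without calling OpenCV and lists which subplot positions are filled. The function name `populated_cells` is hypothetical. It assumes the same 5 x 5 grid and the same `th1 <= th2` condition as the loop above.

```python
def populated_cells(values, ncols):
    """List (subplot_index, row, col, th1, th2) for every pair with th1 <= th2."""
    cells = []
    cnt = 0
    for th1 in values:
        for th2 in values:
            cnt += 1
            if th1 <= th2:
                # Subplot indices are 1-based and run row by row
                row, col = divmod(cnt - 1, ncols)
                cells.append((cnt, row, col, th1, th2))
    return cells


cells = populated_cells(range(0, 500, 100), ncols=5)
print(len(cells))  # 15 of the 25 grid positions are filled
print(cells[5])    # (7, 1, 1, 100, 100)
```

Only the diagonal and the cells to its right are drawn, so the lower-left part of the figure stays empty.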