
Python Image Processing Cookbook

By Sandipan Dey
About this book
With the advancements in wireless devices and mobile technology, there's increasing demand for people with digital image processing skills in order to extract useful information from the ever-growing volume of images. This book provides comprehensive coverage of the relevant tools and algorithms, and guides you through analysis and visualization for image processing. With the help of over 60 cutting-edge recipes, you'll address common challenges in image processing and learn how to perform complex tasks such as object detection, image segmentation, and image reconstruction using large hybrid datasets. Dedicated sections will also take you through implementing various image enhancement and image restoration techniques, such as cartooning, gradient blending, and sparse dictionary learning. As you advance, you'll get to grips with face morphing and image segmentation techniques. With an emphasis on practical solutions, this book will help you apply deep learning techniques such as transfer learning and fine-tuning to solve real-world problems. By the end of this book, you'll be proficient in utilizing the capabilities of the Python ecosystem to implement various image processing techniques effectively.
Publication date: April 2020
Publisher: Packt
Pages: 438
ISBN: 9781789537147

 

Image Manipulation and Transformation

Image transformation is the art of transforming an image. With image transformation and manipulation, we can enhance the appearance of an image. The transformation and manipulation operation can also be used as preprocessing steps for more complex image processing tasks, such as classification or segmentation, which you will get more acquainted with in later chapters. In this chapter, you are going to learn how to use different Python libraries (NumPy, SciPy, scikit-image, OpenCV-Python, Mahotas, and Matplotlib) for image manipulation and transformation. Different recipes will help you to learn how to write Python code to implement color space transformation, geometric transformations, perspective transforms/homography, and so on.

In this chapter, we will cover the following recipes:

  • Transforming color space (RGB → Lab)
  • Applying affine transformation
  • Applying perspective transformation and homography
  • Creating pencil sketches from images
  • Creating cartoonish images
  • Simulating light art/long exposure
  • Object detection using color in HSV
 

Technical requirements

To run the code without any errors, you need to first install Python 3 (for example, 3.6) and the required libraries, if they are not already installed. If you are working on Windows, it is recommended that you install the Anaconda distribution. You also need to install the jupyter library to work with the notebooks.

All of the code files in this book are available in the GitHub repository at https://github.com/PacktPublishing/Python-Image-Processing-Cookbook. You should clone the repository (to your working directory). There is a folder for each chapter, and each folder contains a notebook with the complete code for all of the recipes in that chapter; a subfolder named images, which contains all the input images (and related files) required for that chapter; and (optionally) another subfolder named models, which contains the models and related files to be used for the recipes in that chapter.

 

Transforming color space (RGB → Lab)

The CIELAB (abbreviated as Lab) color space consists of three channels, expressing the color of a pixel as a triple (L, a, b), where the L channel stands for luminosity/illumination/intensity (lightness), and the a and b channels represent the green-red and blue-yellow color components, respectively. This color model separates the intensity from the colors completely; it is device-independent and has a large gamut. In this recipe, you will see how to convert from RGB into the Lab color space and vice versa, and the usefulness of this color model.

Getting ready

In this recipe, we will use a flower RGB image as the input image. Let's start by importing the required libraries with the following code block:

import numpy as np
from skimage.io import imread
from skimage.color import rgb2lab, lab2rgb
import matplotlib.pylab as plt

How to do it...

In this recipe, you will see a few remarkable uses of the Lab color space and how it makes some image manipulation operations easy and elegant.

Converting RGB image into grayscale by setting the Lab space color channels to zero

Perform the following steps to convert an RGB color image into a grayscale image using the Lab color space and scikit-image library functions:

  1. Read the input image. Perform a color space transformation from RGB to the Lab color space:
im = imread('images/flowers.png')
im1 = rgb2lab(im)
  2. Set the color channel values (the second and third channels) to zero:
im1[...,1] = im1[...,2] = 0
  3. Obtain the grayscale image by converting the image back into the RGB color space from the Lab color space:
im1 = lab2rgb(im1)
  4. Plot the input and output images, as shown in the following code:
plt.figure(figsize=(20,10))
plt.subplot(121), plt.imshow(im), plt.axis('off'), plt.title('Original image', size=20)
plt.subplot(122), plt.imshow(im1), plt.axis('off'), plt.title('Gray scale image', size=20)
plt.show()

The following screenshot shows the output of the preceding code block:

Changing the brightness of the image by varying the luminosity channel

Perform the following steps to change the brightness of a colored image using the Lab color space and scikit-image library functions:

  1. Convert the input image from RGB into the Lab color space and increase the first channel values (by 50):
im1 = rgb2lab(im)
im1[...,0] = im1[...,0] + 50
  2. Convert it back into the RGB color space and obtain a brighter image:
im1 = lab2rgb(im1)
  3. Convert the RGB image into the Lab color space and decrease only the first channel values (by 50, as seen in the following code) and then convert back into the RGB color space to get a darker image instead:
im1 = rgb2lab(im)
im1[...,0] = im1[...,0] - 50
im1 = lab2rgb(im1)

If you run the preceding code and plot the input and output images, you will get an output similar to the one shown in the following screenshot:

How it works...

The rgb2lab() function from the scikit-image color module was used to convert an image from RGB into the Lab color space.

The modified image in the Lab color space was converted back into RGB using the lab2rgb() function from the scikit-image color module.

Since the Lab color space separates the color information (in the a and b channels) from the intensity (in the L channel), setting the two color channel values to zero removes the color and leaves only the intensity, which is how we obtain a grayscale image from a colored image in the Lab space.

The brightness of the input color image was changed by changing only the L channel values in the Lab space (unlike in the RGB color space where all the channel values need to be changed); there is no need to touch the color channels.

There's more...

There are many other uses of the Lab color space. For example, you can obtain a more natural inverted image in the Lab space since only the luminosity channel needs to be inverted, as demonstrated in the following code block:

im1 = rgb2lab(im)
im1[...,0] = np.max(im1[...,0]) - im1[...,0]
im1 = lab2rgb(im1)

If you run the preceding code and display the input image and the inverted images obtained in the RGB and the Lab space, you will get the following screenshot:

As you can see, the Inverted image in the Lab color space appears much more natural than the Inverted image in the RGB color space.
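For comparison, here is a minimal sketch (not part of the recipe) of inverting the same image directly in the RGB space; it assumes the im array loaded earlier and uses img_as_float so that the channel values lie in [0, 1]:

from skimage import img_as_float

im_f = img_as_float(im)[..., :3]   # scale to [0, 1] and keep only the RGB channels
im_inverted_rgb = 1.0 - im_f       # invert every channel, colors included

plt.figure(figsize=(10,5))
plt.subplot(121), plt.imshow(im), plt.axis('off'), plt.title('Original image', size=20)
plt.subplot(122), plt.imshow(im_inverted_rgb), plt.axis('off'), plt.title('Inverted (RGB)', size=20)
plt.show()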

 

Applying affine transformation

An affine transformation is a geometric transformation that preserves points, straight lines, and planes. Lines that are parallel before the transform remain parallel post-application of the transform. For every pixel x in an image, the affine transformation can be represented by the mapping x ↦ Mx + b, where M is a linear transform (matrix) and b is an offset vector.
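As a quick illustrative sketch (not part of the recipe), the mapping can be applied to a single 2D point with plain NumPy; the matrix and offset values below are arbitrary examples:

import numpy as np

M = np.array([[1.0, 0.5],
              [0.0, 1.0]])   # example linear part (a horizontal shear)
b = np.array([10.0, 20.0])   # example offset (translation)

x = np.array([3.0, 4.0])     # a sample pixel location
y = M @ x + b                # affine mapping x -> Mx + b
print(y)                     # [15. 24.]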

In this recipe, we will use the scipy ndimage library function, affine_transform(), to implement such a transformation on an image.

Getting ready

First, let's import the libraries and the functions required to implement an affine transformation on a grayscale image:

import numpy as np
from scipy import ndimage as ndi
from skimage.io import imread
from skimage.color import rgb2gray

How to do it...

Perform the following steps to apply an affine transformation to an image using the scipy.ndimage module functions:

  1. Read the color image, convert it into grayscale, and obtain the grayscale image shape:
img = rgb2gray(imread('images/humming.png'))
w, h = img.shape
  2. Apply the identity transform:
mat_identity = np.array([[1,0,0],[0,1,0],[0,0,1]])
img1 = ndi.affine_transform(img, mat_identity)
  3. Apply the reflection transform (along the x axis):
mat_reflect = np.array([[1,0,0],[0,-1,0],[0,0,1]]) @ np.array([[1,0,0],[0,1,-h],[0,0,1]])
img1 = ndi.affine_transform(img, mat_reflect) # offset=(0,h)
  4. Scale the image (0.75 times along the x axis and 1.25 times along the y axis):
s_x, s_y = 0.75, 1.25
mat_scale = np.array([[s_x,0,0],[0,s_y,0],[0,0,1]])
img1 = ndi.affine_transform(img, mat_scale)

  5. Rotate the image by 30° counter-clockwise. It's a composite operation: first, you will need to shift/center the image, apply the rotation, and then apply the inverse shift:
theta = np.pi/6
mat_rotate = np.array([[1,0,w/2],[0,1,h/2],[0,0,1]]) @ np.array([[np.cos(theta),np.sin(theta),0],[-np.sin(theta),np.cos(theta),0],[0,0,1]]) @ np.array([[1,0,-w/2],[0,1,-h/2],[0,0,1]])
img1 = ndi.affine_transform(img1, mat_rotate)
  6. Apply the shear transform to the image:
lambda1 = 0.5
mat_shear = np.array([[1,lambda1,0],[lambda1,1,0],[0,0,1]])
img1 = ndi.affine_transform(img1, mat_shear)
  7. Finally, apply all of the transforms together, in sequence:
mat_all = mat_identity @ mat_reflect @ mat_scale @ mat_rotate @ mat_shear
img1 = ndi.affine_transform(img, mat_all)

The following screenshot shows the matrices (M) for each of the affine transformation operations:

How it works...

Note that, for an image, the x axis is the vertical (+ve downward) axis and the y axis is the horizontal (+ve left-to-right) axis.

With the affine_transform() function, the pixel value at location o in the output (transformed) image is determined from the pixel value in the input image at position np.dot(matrix, o) + offset. Hence, the matrix that needs to be provided as input to the function is actually the inverse transformation matrix.

In some of the cases, an additional matrix is used for translation, to bring the transformed image within the frame of visualization.

The preceding code snippets show how to implement different affine transformations such as reflection, scaling, rotation, and shear using the affine_transform() function. We need to provide the proper transformation matrix, M (shown in the preceding diagram) for each of these cases (homogeneous coordinates are used).

We can use the product of all of the matrices to perform a combination of all of the affine transformations at once (for instance, if you want transformation T1 followed by T2, you need to use the combined matrix T2·T1).
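To make the inverse-mapping convention concrete, the following is a minimal sketch (not the book's code) that builds a forward rotation matrix and passes its inverse to affine_transform(); it reuses the humming.png image and the 30° angle from this recipe:

import numpy as np
from scipy import ndimage as ndi
from skimage.io import imread
from skimage.color import rgb2gray

img = rgb2gray(imread('images/humming.png'))
theta = np.pi/6

# Forward transform: a proper rotation matrix in homogeneous coordinates
T_forward = np.array([[np.cos(theta), -np.sin(theta), 0],
                      [np.sin(theta),  np.cos(theta), 0],
                      [0,              0,             1]])

# affine_transform() maps output coordinates to input coordinates,
# so the matrix to pass is the inverse of the forward transform
img_rotated = ndi.affine_transform(img, np.linalg.inv(T_forward))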

If all of the transformations are applied in sequence and the transformed images are plotted one by one, you will obtain an output like the following screenshot:

There's more...

Again, in the previous example, the affine_transform() function was applied to a grayscale image. The same effect can be obtained with a color image, for example, by applying the transformation to each of the color channels independently. Also, the scikit-image library provides the AffineTransform and PiecewiseAffineTransform classes; you may want to try them to implement affine transformations as well, as sketched next.
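Here is a minimal sketch of that alternative (assumed usage, not the book's code), building an AffineTransform from rotation and shear parameters and applying it with warp(); the humming.png input is reused from this recipe:

import numpy as np
from skimage.io import imread
from skimage.color import rgb2gray
from skimage.transform import AffineTransform, warp

img = rgb2gray(imread('images/humming.png'))

# Forward affine transform: rotation + shear + translation
tform = AffineTransform(rotation=np.pi/6, shear=0.2, translation=(20, 10))

# warp() expects the inverse mapping (output to input coordinates),
# so passing tform.inverse applies tform to the image
img_warped = warp(img, tform.inverse)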

 

Applying perspective transformation and homography

The goal of a perspective (projective) transform is to estimate the homography (a matrix, H) from point correspondences between two images. Since the matrix has eight degrees of freedom (DOF), you need at least four pairs of points to compute the homography matrix from two images. The following diagram shows the basic concepts required to compute the homography matrix:

Fortunately, we don't need to compute the SVD ourselves; the H matrix is estimated automatically by the ProjectiveTransform class from the scikit-image transform module. In this recipe, we will use this class to implement homography.
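As a quick illustration of what the homography does (the matrix values below are arbitrary, not taken from the recipe), a point is mapped by multiplying its homogeneous coordinates by H and dividing by the third component:

import numpy as np

H = np.array([[1.2,   0.1, 5.0],    # an example 3x3 homography (8 DOF, since the overall scale is arbitrary)
              [0.0,   0.9, 10.0],
              [0.001, 0.0, 1.0]])

x, y = 100.0, 50.0                   # a point in the source image
p = H @ np.array([x, y, 1.0])        # multiply in homogeneous coordinates
x_dst, y_dst = p[0]/p[2], p[1]/p[2]  # perspective division gives the destination point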

Getting ready

We will use a hummingbird image and an image of an astronaut on the moon (taken from NASA's public domain images) as input images in this recipe. Again, let's start by importing the required libraries as usual:

from skimage.transform import ProjectiveTransform
from skimage.io import imread
import numpy as np
import matplotlib.pylab as plt

How to do it...

Perform the following steps to apply a projective transformation to an image using the transform module from scikit-image:

  1. First, read the source image and create a destination image with the np.zeros() function:
im_src = (imread('images/humming2.png'))
height, width, dim = im_src.shape
im_dst = np.zeros((height, width, dim))
  2. Create an instance of the ProjectiveTransform class:
pt = ProjectiveTransform()

  3. You just need to provide four pairs of matching points between the source and destination images to the estimate() method, which computes the homography matrix, H, for you. Here, the four corners of the input hummingbird and the four corners of the destination image are provided as matching points, as shown in the following code block:
src = np.array([[ 295., 174.],
                [ 540., 146.],
                [ 400., 777.],
                [  60., 422.]])
dst = np.array([[ 0., 0.],
                [height-1, 0.],
                [height-1, width-1],
                [ 0., width-1]])
pt.estimate(src, dst)
  4. Obtain the source pixel index corresponding to each pixel index in the destination:
x, y = np.mgrid[:height, :width]
dst_indices = np.hstack((x.reshape(-1, 1), y.reshape(-1, 1)))
src_indices = np.round(pt.inverse(dst_indices), 0).astype(int)
valid_idx = np.where((src_indices[:,0] < height) & (src_indices[:,1] < width) &
                     (src_indices[:,0] >= 0) & (src_indices[:,1] >= 0))
dst_indices_valid = dst_indices[valid_idx]
src_indices_valid = src_indices[valid_idx]
  5. Copy pixels from the source to the destination image:
im_dst[dst_indices_valid[:,0], dst_indices_valid[:,1]] = im_src[src_indices_valid[:,0], src_indices_valid[:,1]]

If you run the preceding code snippets, you will get an output like the following screenshot:

The next screenshot shows the source image of an astronaut on the moon and the destination image of the canvas. Again, by providing four pairs of mapping points in between the source (corner points) and destination (corners of the canvas), the task is pretty straightforward:

The following screenshot shows the output image after the projective transform:

How it works...

In both of the preceding cases, the input image is projected onto the desired location of the output image.

A ProjectiveTransform object needs to be created first in order to apply a perspective transform to an image.

A set of four pixel positions from the source image and the corresponding matching pixel positions in the destination image need to be passed to the object's estimate() method, which computes the homography matrix, H (and returns True if it can be computed).

The inverse() method is then called on the object; this gives you the source pixel indices corresponding to all of the destination pixel indices.

There's more...

You can use the warp() function (instead of the inverse() function) to implement homography/projective transform.
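A minimal sketch of that approach (assumed usage), reusing the src and dst point arrays and the im_src image defined in this recipe, could look as follows; note that scikit-image transforms expect (x, y), that is, (column, row) coordinates, so the point arrays may need their columns swapped:

from skimage.transform import ProjectiveTransform, warp

pt = ProjectiveTransform()
pt.estimate(src, dst)   # estimate H from the four point correspondences

# warp() expects a mapping from output (destination) coordinates back to
# input (source) coordinates, which is what pt.inverse provides
im_out = warp(im_src, pt.inverse, output_shape=(height, width))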

See also

 

Creating pencil sketches from images

Producing sketches from images is all about detecting edges in images. In this recipe, you will learn how to use different techniques, including the difference of Gaussian (and its extended version, XDOG), anisotropic diffusion, and dodging (applying Gaussian blur + invert + thresholding), to obtain sketches from images.

Getting ready

The following libraries need to be imported first:

import numpy as np
from skimage.io import imread
from skimage.color import rgb2gray
from skimage import util
from skimage import img_as_float
import matplotlib.pylab as plt
from medpy.filter.smoothing import anisotropic_diffusion
from skimage.filters import gaussian, threshold_otsu

How to do it...

The following steps need to be performed:

  1. Define the normalize() function to implement min-max normalization in an image:
def normalize(img):
    return (img - np.min(img)) / (np.max(img) - np.min(img))
  2. Implement the sketch() function that takes an image and the extracted edges as input:
def sketch(img, edges):
    output = np.multiply(img, edges)
    output[output > 1] = 1
    output[edges == 1] = 1
    return output

  3. Implement a function to extract the edges from an image with anisotropic diffusion:
def edges_with_anisotropic_diffusion(img, niter=100, kappa=10, gamma=0.1):
    output = img - anisotropic_diffusion(img, niter=niter, kappa=kappa,
                                         gamma=gamma, voxelspacing=None, option=1)
    output[output > 0] = 1
    output[output < 0] = 0
    return output
  4. Implement a function to extract the edges from an image with the dodge operation (there are two implementations):
def sketch_with_dodge(img):
    orig = img
    blur = gaussian(util.invert(img), sigma=20)
    result = blur / util.invert(orig)
    result[result > 1] = 1
    result[orig == 1] = 1
    return result

def edges_with_dodge2(img):
    img_blurred = gaussian(util.invert(img), sigma=5)
    output = np.divide(img, util.invert(img_blurred) + 0.001)
    output = normalize(output)
    thresh = threshold_otsu(output)
    output = output > thresh
    return output
  5. Implement a function to extract the edges from an image with a Difference of Gaussian (DOG) operation:
def edges_with_DOG(img, k=200, gamma=1):
    sigma = 0.5
    output = gaussian(img, sigma=sigma) - gamma*gaussian(img, sigma=k*sigma)
    output[output > 0] = 1
    output[output < 0] = 0
    return output

  6. Implement a function to produce sketches from an image with an Extended Difference of Gaussian (XDOG) operation:
def sketch_with_XDOG(image, epsilon=0.01):
    phi = 10
    difference = edges_with_DOG(image, 200, 0.98).astype(np.uint8)
    for i in range(0, len(difference)):
        for j in range(0, len(difference[0])):
            if difference[i][j] >= epsilon:
                difference[i][j] = 1
            else:
                ht = np.tanh(phi*(difference[i][j] - epsilon))
                difference[i][j] = 1 + ht
    difference = normalize(difference)
    return difference

If you run the preceding code and plot all of the input/output images, you will obtain an output like the following screenshot:

How it works...

As you can see from the previous section, many of the sketching techniques work by blurring the image (for example, with a Gaussian filter or diffusion) to remove details to some extent, and then comparing the original with the blurred version (by subtraction or division) to extract the sketch outlines.

The gaussian() function from the scikit-image filters module was used to blur the images.

The anisotropic_diffusion() function from the filter.smoothing module of the medpy library was used to find edges with anisotropic diffusion (a variational method).

The dodge operation divides (using np.divide()) the image by the inverted blurred image. This highlights the boldest edges in the image.

There's more...

There are a few more edge detection techniques, such as Canny (with hysteresis thresholds), that you can try to produce sketches from images (a minimal Canny-based sketch is shown at the end of this section). You can try them on your own and compare the sketches obtained using different algorithms. Also, by using OpenCV-Python's pencilSketch() and stylization() functions, you can produce black-and-white and color pencil sketches, as well as watercolor-like stylized images, with the following few lines of code:

import cv2
import matplotlib.pylab as plt
src = cv2.imread('images/bird.png')
#dst = cv2.detailEnhance(src, sigma_s=10, sigma_r=0.15)
dst_sketch, dst_color_sketch = cv2.pencilSketch(src, sigma_s=50, sigma_r=0.05, shade_factor=0.05)
dst_water_color = cv2.stylization(src, sigma_s=50, sigma_r=0.05)

If you run this code and plot the images, you will get a diagram similar to the following screenshot:
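As mentioned above, Canny edge detection (with hysteresis thresholds) can also be used to produce a rough pencil-sketch effect. The following is a minimal sketch of the idea, assuming the same bird.png input; it simply inverts the Canny edge map so that the edges appear dark on a white background:

from skimage.io import imread
from skimage.color import rgb2gray
from skimage.feature import canny
from skimage import util
import matplotlib.pylab as plt

img = rgb2gray(imread('images/bird.png'))

# Canny with hysteresis thresholds: weak edges are kept only if connected to strong ones
edges = canny(img, sigma=2, low_threshold=0.05, high_threshold=0.2)

sketch_canny = util.invert(edges)   # dark edges on a white background

plt.imshow(sketch_canny, cmap='gray'), plt.axis('off'), plt.title('Canny sketch')
plt.show()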

See also

 

Creating cartoonish images

In this recipe, you will learn how to create cartoonish flat-textured images from an image. Again, there is more than one way to do the same; here, we will learn how to do it with edge-preserving bilateral filters.

Getting ready

The following libraries need to be imported first:

import cv2
import numpy as np
import matplotlib.pylab as plt

How to do it...

For this recipe, we will be using the bilateralFilter() function from OpenCV-Python. We need to start by downsampling the image to create an image pyramid (you will see more of this in the next chapter), followed by repeated application of small bilateral filters (to remove unimportant details) and upsampling the image to its original size. Next, you need to apply the median blur (to flatten the texture) followed by masking the original image with the binary image obtained by adaptive thresholding. The following code demonstrates the steps:

  1. Read the input image and initialize the parameters to be used later:
img = plt.imread("images/bean.png")

num_down = 2 # number of downsampling steps
num_bilateral = 7 # number of bilateral filtering steps

w, h, _ = img.shape
  2. Use the Gaussian pyramid's downsampling to reduce the image size (and make the subsequent operations faster):
img_color = np.copy(img)
for _ in range(num_down):
    img_color = cv2.pyrDown(img_color)
  3. Apply bilateral filters (with a small diameter value) iteratively. The d parameter represents the diameter of the neighborhood for each pixel, where the sigmaColor and sigmaSpace parameters represent the filter sigma in the color and the coordinate spaces, respectively:
for _ in range(num_bilateral):
    img_color = cv2.bilateralFilter(img_color, d=9, sigmaColor=0.1, sigmaSpace=0.01)

  4. Use upsampling to enlarge the image to the original size:
for _ in range(num_down):
    img_color = cv2.pyrUp(img_color)
  5. Convert the original image to grayscale and blur it with the median filter:
img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
img_blur = cv2.medianBlur(img_gray, 7)
  6. Detect and enhance the edges:
img_edge = cv2.adaptiveThreshold((255*img_blur).astype(np.uint8),
                                 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY,
                                 blockSize=9, C=2)
  7. Convert the grayscale edge image back into an RGB color image and compute the bitwise AND with the RGB color image to get the final output cartoonish image:
img_edge = cv2.cvtColor(img_edge, cv2.COLOR_GRAY2RGB)
img_cartoon = cv2.bitwise_and(img_color, img_edge)

How it works...

As explained earlier, the bilateralFilter(), medianBlur(), adaptiveThreshold(), and bitwise_and() functions from OpenCV-Python were the key functions used to first remove weak edges, then convert into flat texture, and finally enhance the prominent edges in the image.

The bilateralFilter() function from OpenCV-Python was used to smooth the textures while keeping the edges fairly sharp:

  • The higher the value of the sigmaColor parameter, the more the pixel colors in the neighborhood will be mixed together. This will produce larger areas of semi-equal color in the output image.
  • The higher the value of the sigmaSpace parameter, the farther apart pixels can be while still influencing each other (as long as their colors are close enough).

The image was downsampled to create an image pyramid (you will see more of this in the next chapter).

Next, repeated application of small bilateral filters was used to remove unimportant details. A subsequent upsampling was used to resize the image to its original size.

Finally, medianBlur was applied (to flatten the texture) followed by masking the original image with the binary image obtained by adaptive thresholding.

If you run the preceding code, you will get an output cartoonish image, as shown here:

There's more...

Play with the parameter values of the OpenCV functions to see the impact on the output image produced. Also, as mentioned earlier, there is more than one way to achieve the same effect. Try using anisotropic diffusion to obtain flat texture images. You should get an image like the following one (use the anisotropic_diffusion() function from the medpy library):
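A minimal sketch of that idea (an assumption about how to combine the pieces, not the book's code) smooths the color image with anisotropic diffusion to flatten the texture and then reuses the adaptive-threshold edge mask from this recipe:

import cv2
import numpy as np
import matplotlib.pylab as plt
from medpy.filter.smoothing import anisotropic_diffusion

img = plt.imread("images/bean.png")

# Flatten the texture by diffusing each color channel (edges are preserved)
img_flat = np.dstack([anisotropic_diffusion(img[..., i], niter=20, kappa=20, gamma=0.1)
                      for i in range(3)])
img_flat = np.clip(img_flat, 0, 1)

# Edge mask, as in the recipe: median blur + adaptive thresholding on the grayscale image
img_gray = cv2.cvtColor((255 * img[..., :3]).astype(np.uint8), cv2.COLOR_RGB2GRAY)
img_blur = cv2.medianBlur(img_gray, 7)
img_edge = cv2.adaptiveThreshold(img_blur, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, blockSize=9, C=2)

# Keep the flat colors only where no edge was detected (edges are drawn in black)
img_cartoon = img_flat * (cv2.cvtColor(img_edge, cv2.COLOR_GRAY2RGB) / 255.0)

plt.imshow(img_cartoon), plt.axis('off'), plt.show()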

See also

 

Simulating light art/long exposure

Long exposure (or light art) refers to the process of creating a photo that captures the effect of passing time. Popular examples of long-exposure photographs are silky-smooth water and the continuous bands of light traced by car headlights on highways. In this recipe, we will simulate long exposure by averaging the image frames from a video.

Getting ready

We will extract image frames from a video and then average the frames to simulate light art. Let's start by importing the required libraries:

from glob import glob
import cv2
import numpy as np
import matplotlib.pylab as plt

How to do it...

The following steps need to be performed:

  1. Implement an extract_frames() function to extract the first 200 frames (at most) from a video passed as input to the function:
def extract_frames(vid_file):
    vidcap = cv2.VideoCapture(vid_file)
    success, image = vidcap.read()
    i = 1
    success = True
    while success and i <= 200:
        cv2.imwrite('images/exposure/vid_{}.jpg'.format(i), image)
        success, image = vidcap.read()
        i += 1
  2. Call the function to save all of the frames (as .jpg) extracted from the video of the waterfall in Godafoss (Iceland) to the exposure folder:
extract_frames('images/godafost.mp4') #cloud.mp4
  3. Read all the .jpg files from the exposure folder; read them one by one (as float); split each image into B, G, and R channels; compute a running sum of the color channels; and finally, compute average values for the color channels:
imfiles = glob('images/exposure/*.jpg')
nfiles = len(imfiles)
R1, G1, B1 = 0, 0, 0
for i in range(nfiles):
    image = cv2.imread(imfiles[i]).astype(float)
    (B, G, R) = cv2.split(image)
    R1 += R
    B1 += B
    G1 += G
R1, G1, B1 = R1 / nfiles, G1 / nfiles, B1 / nfiles
  4. Merge the average values of the color channels obtained and save the final output image:
final = cv2.merge([B1, G1, R1])
cv2.imwrite('images/godafost.png', final)

The following photo shows one of the extracted input frames:

If you run the preceding code block, you will obtain a long exposure-like image like the one shown here:

Notice the continuous effects in the clouds and the waterfall.

How it works...

The VideoCapture() function from OpenCV-Python was used to create a VideoCapture object with the video file as input. Then, the read() method of that object was used to capture frames from the video.

The imread() and imwrite() functions from OpenCV-Python were used to read/write images from/to disk.

The cv2.split() function was used to split an RGB image into individual color channels, while the cv2.merge() function was used to combine them back into an RGB image.

There's more...

Focus stacking (also known as extended depth of field) is a technique (in image processing/computational photography) that takes multiple images of the same subject, captured at different focus distances, and combines them to create an output image with a greater depth of field (DOF) than any of the individual source images. We can simulate focus stacking in Python. The following is an example of focus stacking grayscale image frames extracted from a video using the mahotas library.

Extended depth of field with mahotas

Perform the following steps to implement focus stacking with the mahotas library functions:

  1. Create the image stack first by extracting grayscale image frames from a highway traffic video at night:
import mahotas as mh
def create_image_stack(vid_file, n=200):
    vidcap = cv2.VideoCapture(vid_file)
    success, image = vidcap.read()
    i = 0
    success = True
    h, w = image.shape[:2]
    imstack = np.zeros((n, h, w))
    while success and i < n:
        imstack[i,...] = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        success, image = vidcap.read()
        i += 1
    return imstack

image = create_image_stack('images/highway.mp4') #cloud.mp4
stack, h, w = image.shape
  2. Use the sobel() function from mahotas as the pixel-level measure of infocusness:
focus = np.array([mh.sobel(t, just_filter=True) for t in image])
  3. At each pixel location, select the best slice (with maximum infocusness) and create the final image:
best = np.argmax(focus, 0)
image = image.reshape((stack,-1)) # image is now (stack, nr_pixels)
image = image.transpose() # image is now (nr_pixels, stack)
final = image[np.arange(len(image)), best.ravel()] # select the right pixel at each location
final = final.reshape((h,w)) # reshape to get the final result

The following photo is an input image used in the image stack:

The following screenshot is the final output image produced by the algorithm implementation:

See also

 

Object detection using color in HSV

In this recipe, you will learn how to detect objects using colors in the HSV color space using OpenCV-Python. You need to specify a range of color values by means of which the object you are interested in will be identified and extracted. You can change the color of the object detected and even make the detected object transparent.

Getting ready

In this recipe, the input image we will use will be an orange fish in an aquarium and the object of interest will be the fish. You will detect the fish, change its color, and make it transparent using the color range of the fish in HSV space. Let's start by importing the required libraries:

import cv2
import numpy as np
import matplotlib.pylab as plt

How to do it...

To do the recipe, the following steps need to be performed:

  1. Read the input and background images. Convert the input image from BGR into the HSV color space:
bck = cv2.imread("images/fish_bg.png")
img = cv2.imread("images/fish.png")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
  2. Create a mask for the fish by selecting a possible range of HSV colors that the fish can have:
mask = cv2.inRange(hsv, (5, 75, 25), (25, 255, 255))
  3. Slice the orange fish using the mask:
imask = mask > 0
orange = np.zeros_like(img, np.uint8)
orange[imask] = img[imask]
  4. Change the color of the orange fish to yellow by changing the hue channel value only (add 20) and converting the image back into the BGR space:
yellow = img.copy()
hsv[...,0] = hsv[...,0] + 20
yellow[imask] = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)[imask]
yellow = np.clip(yellow, 0, 255)

  5. Finally, create the transparent fish image: extract, from the background image, the region covered by the fish; remove the fish region from the input image; and then add these two images together:
bckfish = cv2.bitwise_and(bck, bck, mask=imask.astype(np.uint8))
nofish = img.copy()
nofish = cv2.bitwise_and(nofish, nofish, mask=(np.bitwise_not(imask)).astype(np.uint8))
nofish = nofish + bckfish

How it works...

The following screenshot shows an HSV colormap for fast color lookup. The x axis denotes hue, with values in (0, 180), and the y axis denotes saturation, with values in (0, 255); the colors shown correspond to S = 255 and V = 255. To locate a particular color in the colormap, just look up the corresponding H and S range, and then set the range of V as (25, 255). For example, the orange color of the fish we are interested in can be looked up in the HSV range from (5, 75, 25) to (25, 255, 255), as observed here:
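A small sketch (not from the book) that generates such a hue-saturation lookup image with OpenCV, so that you can eyeball where a color falls, could look like this:

import numpy as np
import cv2
import matplotlib.pylab as plt

# Build an HSV grid: hue varies along x (0..179), saturation along y (0..255), V fixed at 255
h = np.arange(180, dtype=np.uint8)
s = np.arange(256, dtype=np.uint8)
H, S = np.meshgrid(h, s)
V = np.full_like(H, 255)
hsv_map = cv2.merge([H, S, V])

rgb_map = cv2.cvtColor(hsv_map, cv2.COLOR_HSV2RGB)
plt.imshow(rgb_map, origin='lower', extent=[0, 180, 0, 256], aspect='auto')
plt.xlabel('Hue'), plt.ylabel('Saturation'), plt.show()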

The inRange() function from OpenCV-Python was used for color detection. It accepts the HSV input image along with the color range (defined previously) as parameters.

cv2.inRange() accepts three parameters: the input image, and the lower and upper limits of the color to be detected, respectively. It returns a binary mask, where white pixels represent the pixels within the range and black pixels represent those outside the specified range.

To change the color of the fish detected, it is sufficient to change the hue (color) channel value only; we don't need to touch the saturation and value channels.

The bitwise arithmetic with OpenCV-Python was used to extract the foreground/background.

Notice that the background image has slightly different colors from the fish image's background; otherwise, the transparent fish would have disappeared completely (invisible cloaking!).

If you run the preceding code snippets and plot all of the images, you will get the following output:

Note that, in OpenCV-Python, an image in the RGB color space is stored in BGR format. If we want to display the image in proper colors, before using imshow() from Matplotlib (which expects the image in RGB format instead), we must convert the image colors with cv2.cvtColor(image, cv2.COLOR_BGR2RGB).
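For example, displaying the fish image read earlier with cv2.imread() would look like this (a minimal sketch):

import cv2
import matplotlib.pylab as plt

img = cv2.imread("images/fish.png")               # loaded in BGR channel order
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # convert to RGB for Matplotlib
plt.axis('off'), plt.show()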

See also

About the Author
  • Sandipan Dey

Sandipan Dey is a data scientist with a wide range of interests, covering topics such as machine learning, deep learning, image processing, and computer vision. He has worked in numerous data science fields, including recommender systems, predictive models for the events industry, sensor localization models, sentiment analysis, and device prognostics. He earned his master's degree in computer science from the University of Maryland, Baltimore County, and has published in a few IEEE Data Mining conferences and journals. He has earned certifications from 100+ MOOCs on data science, machine learning, deep learning, image processing, and related courses. He is a regular blogger (sandipanweb) and is a machine learning education enthusiast.
