Image Manipulation and Transformation
Image transformation and manipulation techniques change an image's geometry, color, or appearance, either to enhance it or to prepare it for further analysis. These operations also serve as preprocessing steps for more complex image processing tasks, such as classification or segmentation, which you will get more acquainted with in later chapters. In this chapter, you are going to learn how to use different Python libraries (NumPy, SciPy, scikit-image, OpenCV-Python, Mahotas, and Matplotlib) for image manipulation and transformation. The recipes will show you how to write Python code to implement color space transformations, geometric transformations, perspective transformations/homography, and more.
In this chapter, we will cover the following recipes:
- Transforming color space (RGB → Lab)
- Applying affine transformation
- Applying perspective transformation and homography
- Creating pencil sketches from images
- Creating cartoonish images
- Simulating light art/long exposure
- Object detection using color in HSV
Technical requirements
To run the code in this chapter without errors, you need to first install Python 3 (for example, 3.6) and the required libraries, if they are not already installed. If you are working on Windows, installing the Anaconda distribution is recommended. You also need to install the jupyter library to work with the notebooks.
All of the code files in this book are available in the GitHub repository at https://github.com/PacktPublishing/Python-Image-Processing-Cookbook. You should clone the repository (to your working directory). There is one folder per chapter; each folder contains a notebook with the complete code for all of the recipes in that chapter, a subfolder named images containing all the input images (and related files) required for that chapter, and (optionally) another subfolder named models containing the models and related files to be used for the recipes in that chapter.
Transforming color space (RGB → Lab)
The CIELAB (abbreviated as Lab) color space consists of three channels, expressing the color of a pixel as a triple (L, a, b), where the L channel stands for luminosity/illumination/intensity (lightness). The a and b channels represent the green-red and blue-yellow color components, respectively. This color model separates intensity from color completely. It's device-independent and has a large gamut. In this recipe, you will see how to convert from RGB into the Lab color space and vice versa, and why this color model is useful.
Getting ready
In this recipe, we will use a flower RGB image as the input image. Let's start by importing the required libraries with the following code block:
import numpy as np
from skimage.io import imread
from skimage.color import rgb2lab, lab2rgb
import matplotlib.pylab as plt
How to do it...
In this recipe, you will see a few remarkable uses of the Lab color space and how it makes some image manipulation operations easy and elegant.
Converting RGB image into grayscale by setting the Lab space color channels to zero
Perform the following steps to convert an RGB color image into a grayscale image using the Lab color space and scikit-image library functions:
- Read the input image. Perform a color space transformation—from RGB to Lab color space:
im = imread('images/flowers.png')
im1 = rgb2lab(im)
- Set the color channel values (the second and third channels) to zeros:
im1[...,1] = im1[...,2] = 0
- Obtain the grayscale image by converting the image back into the RGB color space from the Lab color space:
im1 = lab2rgb(im1)
- Plot the input and output images, as shown in the following code:
plt.figure(figsize=(20,10))
plt.subplot(121), plt.imshow(im), plt.axis('off'), plt.title('Original image', size=20)
plt.subplot(122), plt.imshow(im1), plt.axis('off'), plt.title('Gray scale image', size=20)
plt.show()
The following screenshot shows the output of the preceding code block:

Changing the brightness of the image by varying the luminosity channel
Perform the following steps to change the brightness of a colored image using the Lab color space and scikit-image library functions:
- Convert the input image from RGB into the Lab color space and increase the first channel values (by 50):
im1 = rgb2lab(im)
im1[...,0] = im1[...,0] + 50
- Convert it back into the RGB color space and obtain a brighter image:
im1 = lab2rgb(im1)
- Convert the RGB image into the Lab color space and decrease only the first channel values (by 50, as seen in the following code) and then convert back into the RGB color space to get a darker image instead:
im1 = rgb2lab(im)
im1[...,0] = im1[...,0] - 50
im1 = lab2rgb(im1)
If you run the preceding code and plot the input and output images, you will get an output similar to the one shown in the following screenshot:

How it works...
The rgb2lab() function from the scikit-image color module was used to convert an image from RGB into the Lab color space.
The modified image in the Lab color space was converted back into RGB using the lab2rgb() function from the scikit-image color module.
Since the a and b channels carry all of the color information and the L channel carries only the intensity, setting the two color channels to zero leaves nothing but the intensity, which is why converting the image back into RGB yields a grayscale image.
The brightness of the input color image was changed by modifying only the L channel values in the Lab space (unlike in the RGB color space, where all three channel values need to be changed); the color channels are left untouched.
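Putting this together, a small helper along the following lines (a sketch; the function name and the delta parameter are illustrative) can brighten or darken an image by operating on the L channel alone:
import numpy as np
from skimage.color import rgb2lab, lab2rgb

def adjust_brightness_lab(im, delta):
    # Convert to Lab, shift only the luminosity channel, and convert back
    lab = rgb2lab(im)
    lab[..., 0] = np.clip(lab[..., 0] + delta, 0, 100)   # L lies in [0, 100]
    return lab2rgb(lab)

brighter = adjust_brightness_lab(im, 50)    # lighter version of the image
darker = adjust_brightness_lab(im, -50)     # darker version of the image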
There's more...
There are many other uses of the Lab color space. For example, you can obtain a more natural inverted image in the Lab space since only the luminosity channel needs to be inverted, as demonstrated in the following code block:
im1 = rgb2lab(im)
im1[...,0] = np.max(im1[...,0]) - im1[...,0]
im1 = lab2rgb(im1)
If you run the preceding code and display the input image and the inverted images obtained in the RGB and the Lab space, you will get the following screenshot:

As you can see, the Inverted image in the Lab color space appears much more natural than the Inverted image in the RGB color space.
Applying affine transformation
An affine transformation is a geometric transformation that preserves points, straight lines, and planes. Lines that are parallel before the transformation remain parallel after it is applied. For every pixel x in an image, the affine transformation can be represented by the mapping x ↦ Mx + b, where M is a linear transformation (matrix) and b is an offset (translation) vector.
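To make the mapping concrete, here is a minimal NumPy sketch (the values of M and b are illustrative) showing how a single point is transformed:
import numpy as np

M = np.array([[1.0, 0.5],    # linear part: a shear along one axis
              [0.0, 1.0]])
b = np.array([10.0, 20.0])   # offset (translation) vector

x = np.array([3.0, 4.0])     # a pixel coordinate
print(M @ x + b)             # the affine mapping x -> Mx + b, giving [15. 24.]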
In this recipe, we will use the scipy ndimage library function, affine_transform(), to implement such a transformation on an image.
Getting ready
First, let's import the libraries and the functions required to implement an affine transformation on a grayscale image:
import numpy as np
from scipy import ndimage as ndi
from skimage.io import imread
from skimage.color import rgb2gray
How to do it...
Perform the following steps to apply an affine transformation to an image using the scipy.ndimage module functions:
- Read the color image, convert it into grayscale, and obtain the grayscale image shape:
img = rgb2gray(imread('images/humming.png'))
w, h = img.shape
- Apply identity transform:
mat_identity = np.array([[1,0,0],[0,1,0],[0,0,1]])
img1 = ndi.affine_transform(img, mat_identity)
- Apply reflection transform (along the x axis):
mat_reflect = np.array([[1,0,0],[0,-1,0],[0,0,1]]) @ np.array([[1,0,0],[0,1,-h],[0,0,1]])
img1 = ndi.affine_transform(img, mat_reflect) # offset=(0,h)
- Scale the image (0.75 times along the x axis and 1.25 times along the y axis):
s_x, s_y = 0.75, 1.25
mat_scale = np.array([[s_x,0,0],[0,s_y,0],[0,0,1]])
img1 = ndi.affine_transform(img, mat_scale)
- Rotate the image by 30° counter-clockwise. It's a composite operation—first, you will need to shift/center the image, apply rotation, and then apply inverse shift:
theta = np.pi/6
mat_rotate = np.array([[1,0,w/2],[0,1,h/2],[0,0,1]]) @ np.array([[np.cos(theta),np.sin(theta),0],[-np.sin(theta),np.cos(theta),0],[0,0,1]]) @ np.array([[1,0,-w/2],[0,1,-h/2],[0,0,1]])
img1 = ndi.affine_transform(img1, mat_rotate)
- Apply shear transform to the image:
lambda1 = 0.5
mat_shear = np.array([[1,lambda1,0],[lambda1,1,0],[0,0,1]])
img1 = ndi.affine_transform(img1, mat_shear)
- Finally, apply all of the transforms together by composing the matrices:
mat_all = mat_identity @ mat_reflect @ mat_scale @ mat_rotate @ mat_shear
img1 = ndi.affine_transform(img, mat_all)
The following screenshot shows the matrices (M) for each of the affine transformation operations:

How it works...
Note that, for an image, the x axis is the vertical (+ve downward) axis and the y axis is the horizontal (+ve left-to-right) axis.
With the affine_transform() function, the pixel value at location o in the output (transformed) image is determined from the pixel value in the input image at position np.dot(matrix, o) + offset. Hence, the matrix that needs to be provided as input to the function is actually the inverse transformation matrix.
In some of the cases, an additional matrix is used for translation, to bring the transformed image within the frame of visualization.
The preceding code snippets show how to implement different affine transformations such as reflection, scaling, rotation, and shear using the affine_transform() function. We need to provide the proper transformation matrix, M (shown in the preceding diagram) for each of these cases (homogeneous coordinates are used).
We can pass the product of all of the matrices to perform the combination of all of the affine transformations at once (in general, applying transformation T1 followed by T2 corresponds to the combined matrix T2·T1; since affine_transform() works with inverse matrices, a call with matrix A followed by a call with matrix B is equivalent to a single call with the product A @ B).
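The equivalence can be checked with a short sketch, reusing the matrices defined earlier in this recipe:
# Applying two transforms in two calls versus applying their product in one call.
# Because affine_transform() treats the given matrix as an inverse mapping
# (output -> input), chaining a call with mat_scale and then one with mat_shear
# matches a single call with mat_scale @ mat_shear (up to interpolation error).
img_two_steps = ndi.affine_transform(ndi.affine_transform(img, mat_scale), mat_shear)
img_one_step = ndi.affine_transform(img, mat_scale @ mat_shear)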
If all of the transformations are applied in sequence and the transformed images are plotted one by one, you will obtain an output like the following screenshot:

There's more...
Again, in the previous example, the affine_transform() function was applied to a grayscale image. The same effect can be obtained with a color image, too, for example, by applying the mapping function to each of the image channels separately and independently. The scikit-image library also provides the AffineTransform and PiecewiseAffineTransform classes; you may want to try them to implement affine transformations as well.
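A minimal sketch of the scikit-image route might look like the following (the transformation parameters are illustrative); warp() handles color images directly and applies the inverse mapping for you:
import numpy as np
from skimage.io import imread
from skimage.transform import AffineTransform, warp

im = imread('images/humming.png')    # color image
tform = AffineTransform(scale=(0.75, 1.25), rotation=np.pi/6, shear=0.2,
                        translation=(30, -20))
im_warped = warp(im, tform.inverse)  # warp() expects the output -> input (inverse) map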
Applying perspective transformation and homography
The goal of the perspective (projective) transform is to estimate the homography (a matrix, H) from point correspondences between two images. Since the matrix has eight degrees of freedom (DoF), you need at least four pairs of corresponding points to compute the homography matrix from two images. The following diagram shows the basic concepts required to compute the homography matrix:

Fortunately, we don't need to compute the SVD ourselves; the H matrix is computed automatically by the estimate() method of the ProjectiveTransform class from the scikit-image transform module. In this recipe, we will use this class to implement homography.
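For instance (a quick sketch with made-up point coordinates), once estimate() succeeds, the estimated homography is available through the transform's params attribute:
import numpy as np
from skimage.transform import ProjectiveTransform

src = np.array([[0., 0.], [10., 0.], [10., 10.], [0., 10.]])
dst = np.array([[1., 1.], [12., 2.], [11., 13.], [0., 11.]])
pt = ProjectiveTransform()
if pt.estimate(src, dst):   # returns True when H could be computed
    print(pt.params)        # the 3x3 homography matrix H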
Getting ready
We will use a hummingbird image and an image of an astronaut on the moon (taken from NASA's public domain images) as the input images in this recipe. Again, let's start by importing the required libraries as usual:
from skimage.transform import ProjectiveTransform
from skimage.io import imread
import numpy as np
import matplotlib.pylab as plt
How to do it...
Perform the following steps to apply a projective transformation to an image using the transform module from scikit-image:
- First, read the source image and create a destination image with the np.zeros() function:
im_src = (imread('images/humming2.png'))
height, width, dim = im_src.shape
im_dst = np.zeros((height, width, dim))
- Create an instance of the ProjectiveTransform class:
pt = ProjectiveTransform()
- You just need to provide four pairs of matching points between the source and destination images to the estimate() method, which computes the homography matrix, H, for you. Here, the four corners of the input hummingbird and the four corners of the destination image are provided as the matching points, as shown in the following code block:
src = np.array([[295., 174.],
                [540., 146.],
                [400., 777.],
                [ 60., 422.]])
dst = np.array([[0., 0.],
                [height-1, 0.],
                [height-1, width-1],
                [0., width-1]])
pt.estimate(src, dst)   # estimate the homography matrix H from the point pairs
- Obtain the source pixel index corresponding to each pixel index in the destination:
x, y = np.mgrid[:height, :width]
dst_indices = np.hstack((x.reshape(-1, 1), y.reshape(-1,1)))
src_indices = np.round(pt.inverse(dst_indices), 0).astype(int)
valid_idx = np.where((src_indices[:,0] < height) & (src_indices[:,1] < width) &
                     (src_indices[:,0] >= 0) & (src_indices[:,1] >= 0))
dst_indicies_valid = dst_indices[valid_idx]
src_indicies_valid = src_indices[valid_idx]
- Copy pixels from the source to the destination images:
im_dst[dst_indicies_valid[:,0], dst_indicies_valid[:,1]] = \
    im_src[src_indicies_valid[:,0], src_indicies_valid[:,1]]
If you run the preceding code snippets, you will get an output like the following screenshot:

The next screenshot shows the source image of an astronaut on the moon and the destination image of the canvas. Again, by providing four pairs of matching points between the source (its corner points) and the destination (the corners of the canvas), the task is pretty straightforward:

The following screenshot shows the output image after the projective transform:

How it works...
In both of the preceding cases, the input image is projected onto the desired location of the output image.
A ProjectiveTransform object needs to be created first in order to apply a perspective transform to an image.
A set of four pixel positions from the source image and the corresponding matching pixel positions in the destination image need to be passed to the object's estimate() method; this computes the homography matrix, H (and returns True if it could be computed).
The object's inverse() method is then called to obtain the source pixel indices corresponding to all of the destination pixel indices.
There's more...
You can use the warp() function from scikit-image (instead of calling the inverse() function and copying pixels manually) to implement the homography/projective transform.
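A sketch of that approach, reusing im_src and the src/dst point pairs from this recipe (the output_shape argument is illustrative), could look like this; warp() takes care of the inverse mapping and the interpolation. Note that scikit-image transforms expect (column, row) coordinates, so the point pairs may need their axes swapped:
from skimage.transform import ProjectiveTransform, warp

pt = ProjectiveTransform()
pt.estimate(src, dst)      # the same point pairs as before
im_out = warp(im_src, pt.inverse, output_shape=(height, width))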
Creating pencil sketches from images
Producing sketches from images is all about detecting edges in images. In this recipe, you will learn how to use different techniques, including the difference of Gaussian (and its extended version, XDOG), anisotropic diffusion, and dodging (applying Gaussian blur + invert + thresholding), to obtain sketches from images.
Getting ready
The following libraries need to be imported first:
import numpy as np
from skimage.io import imread
from skimage.color import rgb2gray
from skimage import util
from skimage import img_as_float
import matplotlib.pylab as plt
from medpy.filter.smoothing import anisotropic_diffusion
from skimage.filters import gaussian, threshold_otsu
How to do it...
The following steps need to be performed:
- Define the normalize() function to implement min-max normalization in an image:
def normalize(img):
    return (img - np.min(img)) / (np.max(img) - np.min(img))
- Implement the sketch() function that takes an image and the extracted edges as input:
def sketch(img, edges):
    output = np.multiply(img, edges)
    output[output > 1] = 1
    output[edges == 1] = 1
    return output
- Implement a function to extract the edges from an image with anisotropic diffusion:
def edges_with_anisotropic_diffusion(img, niter=100, kappa=10, gamma=0.1):
    output = img - anisotropic_diffusion(img, niter=niter, kappa=kappa,
                                         gamma=gamma, voxelspacing=None, option=1)
    output[output > 0] = 1
    output[output < 0] = 0
    return output
- Implement a function to extract the edges from an image with the dodge operation (there are two implementations):
def sketch_with_dodge(img):
    orig = img
    blur = gaussian(util.invert(img), sigma=20)
    result = blur / util.invert(orig)
    result[result > 1] = 1
    result[orig == 1] = 1
    return result

def edges_with_dodge2(img):
    img_blurred = gaussian(util.invert(img), sigma=5)
    output = np.divide(img, util.invert(img_blurred) + 0.001)
    output = normalize(output)
    thresh = threshold_otsu(output)
    output = output > thresh
    return output
- Implement a function to extract the edges from an image with a Difference of Gaussian (DOG) operation:
def edges_with_DOG(img, k=200, gamma=1):
    sigma = 0.5
    output = gaussian(img, sigma=sigma) - gamma * gaussian(img, sigma=k*sigma)
    output[output > 0] = 1
    output[output < 0] = 0
    return output
- Implement a function to produce sketches from an image with an Extended Difference of Gaussian (XDOG) operation:
def sketch_with_XDOG(image, epsilon=0.01):
    phi = 10
    difference = edges_with_DOG(image, 200, 0.98).astype(np.uint8)
    for i in range(0, len(difference)):
        for j in range(0, len(difference[0])):
            if difference[i][j] >= epsilon:
                difference[i][j] = 1
            else:
                ht = np.tanh(phi * (difference[i][j] - epsilon))
                difference[i][j] = 1 + ht
    difference = normalize(difference)
    return difference
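The preceding functions only define the operations; a minimal driver tying them together (a sketch that reuses the imports from the Getting ready section and assumes a color input such as images/bird.png) might look like this:
img = rgb2gray(img_as_float(imread('images/bird.png')))   # assumed input path

sketch_dog = sketch(img, edges_with_DOG(img))
sketch_aniso = sketch(img, edges_with_anisotropic_diffusion(img))
sketch_dodge = sketch_with_dodge(img)
sketch_xdog = sketch_with_XDOG(img)

plt.figure(figsize=(15, 10))
for i, (title, im) in enumerate([('original', img), ('DOG', sketch_dog),
                                 ('anisotropic diffusion', sketch_aniso),
                                 ('dodge', sketch_dodge), ('XDOG', sketch_xdog)]):
    plt.subplot(2, 3, i + 1), plt.imshow(im, cmap='gray'), plt.axis('off'), plt.title(title)
plt.show()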
If you run the preceding code and plot all of the input/output images, you will obtain an output like the following screenshot:

How it works...
As you can see from the previous section, many of the sketching techniques work by blurring the image (for example, with a Gaussian filter or with diffusion) to remove some of the detail, and then subtracting the blurred version from the original (or dividing the original by an inverted blurred version) to bring out the sketch outlines.
The gaussian() function from the scikit-image filters module was used to blur the images.
The anisotropic_diffusion() function from the filter.smoothing module of the medpy library was used to find edges with anisotropic diffusion (a variational method).
The dodge operation divides (using np.divide()) the image by the inverted blurred image. This highlights the boldest edges in the image.
There's more...
There are a few more edge detection techniques, such as Canny (with hysteresis thresholds), that you can try to produce sketches from images. You can try them on your own and compare the sketches obtained using the different algorithms. Also, by using OpenCV-Python's pencilSketch() and stylization() functions, you can produce black-and-white and color pencil sketches, as well as watercolor-like stylized images, with the following few lines of code:
import cv2
import matplotlib.pylab as plt
src = cv2.imread('images/bird.png')
#dst = cv2.detailEnhance(src, sigma_s=10, sigma_r=0.15)
dst_sketch, dst_color_sketch = cv2.pencilSketch(src, sigma_s=50, sigma_r=0.05, shade_factor=0.05)
dst_water_color = cv2.stylization(src, sigma_s=50, sigma_r=0.05)
If you run this code and plot the images, you will get a diagram similar to the following screenshot:

Creating cartoonish images
In this recipe, you will learn how to create cartoonish flat-textured images from an image. Again, there is more than one way to do the same; here, we will learn how to do it with edge-preserving bilateral filters.
Getting ready
The following libraries need to be imported first:
import cv2
import numpy as np
import matplotlib.pylab as plt
How to do it...
For this recipe, we will be using the bilateralFilter() function from OpenCV-Python. We start by downsampling the image to create an image pyramid (you will see more of this in the next chapter), followed by the repeated application of small bilateral filters (to remove unimportant details), and then upsample the image back to its original size. Next, a median blur is applied (to flatten the texture), and finally the smoothed color image is masked with the binary edge image obtained by adaptive thresholding. The following code demonstrates the steps:
- Read the input image and initialize the parameters to be used later:
img = plt.imread("images/bean.png")
num_down = 2 # number of downsampling steps
num_bilateral = 7 # number of bilateral filtering steps
w, h, _ = img.shape
- Use the Gaussian pyramid's downsampling to reduce the image size (and make the subsequent operations faster):
img_color = np.copy(img)
for _ in range(num_down):
    img_color = cv2.pyrDown(img_color)
- Apply bilateral filters (with a small diameter value) iteratively. The d parameter represents the diameter of the neighborhood for each pixel, where the sigmaColor and sigmaSpace parameters represent the filter sigma in the color and the coordinate spaces, respectively:
for _ in range(num_bilateral):
    img_color = cv2.bilateralFilter(img_color, d=9, sigmaColor=0.1, sigmaSpace=0.01)
- Use upsampling to enlarge the image to the original size:
for _ in range(num_down):
    img_color = cv2.pyrUp(img_color)
- Convert the original image into grayscale and blur it with the median filter:
img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
img_blur = cv2.medianBlur(img_gray, 7)
- Detect and enhance the edges:
img_edge = cv2.adaptiveThreshold((255*img_blur).astype(np.uint8), \
255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, \
blockSize=9, C=2)
- Convert the grayscale edge image back into an RGB color image and compute bitwise AND with the RGB color image to get the final output cartoonish image:
img_edge = cv2.cvtColor(img_edge, cv2.COLOR_GRAY2RGB)
img_cartoon = cv2.bitwise_and(img_color, img_edge)
How it works...
As explained earlier, the bilateralFilter(), medianBlur(), adaptiveThreshold(), and bitwise_and() functions from OpenCV-Python were the key functions used to first remove weak edges, then convert into flat texture, and finally enhance the prominent edges in the image.
The bilateralFilter() function from OpenCV-Python was used to smooth the textures while keeping the edges fairly sharp:
- The higher the value of the sigmaColor parameter, the more the pixel colors in the neighborhood will be mixed together. This will produce larger areas of semi-equal color in the output image.
- The higher the value of the sigmaSpace parameter, the more that pixels farther apart will influence each other (as long as their colors are close enough).
The image was downsampled to create an image pyramid (you will see more of this in the next chapter).
Next, repeated application of small bilateral filters was used to remove unimportant details. A subsequent upsampling was used to resize the image to its original size.
Finally, a median blur was applied to the grayscale image (to flatten the texture), adaptive thresholding was used to extract the prominent edges, and the smoothed color image was masked with the resulting binary edge image.
If you run the preceding code, you will get an output cartoonish image, as shown here:

There's more...
Play with the parameter values of the OpenCV functions to see the impact on the output image produced. Also, as mentioned earlier, there is more than one way to achieve the same effect. Try using anisotropic diffusion to obtain flat texture images. You should get an image like the following one (use the anisotropic_diffusion() function from the medpy library):

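A rough sketch of the diffusion-based alternative (the niter, kappa, and gamma values here are illustrative) could look like the following:
import numpy as np
import matplotlib.pylab as plt
from medpy.filter.smoothing import anisotropic_diffusion

img = plt.imread('images/bean.png')[..., :3]   # drop the alpha channel if present
# Diffuse each color channel to flatten textures while preserving strong edges
flat = np.dstack([anisotropic_diffusion(img[..., i], niter=50, kappa=20, gamma=0.1)
                  for i in range(3)])
plt.imshow(np.clip(flat, 0, 1)), plt.axis('off'), plt.show()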
Simulating light art/long exposure
Long exposure (or light art) refers to the process of creating a photo that captures the effect of passing time. Popular examples of long-exposure photographs include silky-smooth water and continuous bands of light traced by car headlights along a highway. In this recipe, we will simulate a long exposure by averaging the image frames extracted from a video.
Getting ready
We will extract image frames from a video and then average the frames to simulate light art. Let's start by importing the required libraries:
from glob import glob
import cv2
import numpy as np
import matplotlib.pylab as plt
How to do it...
The following steps need to be performed:
- Implement an extract_frames() function to extract the first 200 frames (at most) from a video passed as input to the function:
def extract_frames(vid_file):
    vidcap = cv2.VideoCapture(vid_file)
    success, image = vidcap.read()
    i = 1
    while success and i <= 200:
        cv2.imwrite('images/exposure/vid_{}.jpg'.format(i), image)
        success, image = vidcap.read()
        i += 1
- Call the function to save all of the frames (as .jpg files) extracted from a video of the Godafoss waterfall (Iceland) to the exposure folder:
extract_frames('images/godafost.mp4') #cloud.mp4
- Read all the .jpg files from the exposure folder; read them one by one (as float); split each image into B, G, and R channels; compute a running sum of the color channels; and finally, compute average values for the color channels:
imfiles = glob('images/exposure/*.jpg')
nfiles = len(imfiles)
R1, G1, B1 = 0, 0, 0
for i in range(nfiles):
    image = cv2.imread(imfiles[i]).astype(float)
    (B, G, R) = cv2.split(image)
    R1 += R
    B1 += B
    G1 += G
R1, G1, B1 = R1 / nfiles, G1 / nfiles, B1 / nfiles
- Merge the average values of the color channels obtained and save the final output image:
final = cv2.merge([B1, G1, R1])
cv2.imwrite('images/godafost.png', final)
The following photo shows one of the extracted input frames:

If you run the preceding code block, you will obtain a long exposure-like image like the one shown here:

Notice the continuous effects in the clouds and the waterfall.
How it works...
The VideoCapture() function from OpenCV-Python was used to create a VideoCapture object with the video file as input. Then, the read() method of that object was used to capture frames from the video.
The imread() and imwrite() functions from OpenCV-Python were used to read/write images from/to disk.
The cv2.split() function was used to split an RGB image into individual color channels, while the cv2.merge() function was used to combine them back into an RGB image.
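Incidentally, because the per-channel running sums are mainly illustrative, the same frame averaging can be written more compactly (a sketch, assuming all extracted frames have identical dimensions; the output filename is illustrative):
import numpy as np
import cv2
from glob import glob

imfiles = glob('images/exposure/*.jpg')
# Stack all frames and average them in one step
avg = np.mean([cv2.imread(f).astype(float) for f in imfiles], axis=0)
cv2.imwrite('images/godafost_avg.png', avg)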
There's more...
Focus stacking (also known as extended depth of field) is a technique (in image processing/computational photography) that takes multiple images of the same subject, captured at different focus distances, and combines them into an output image with a greater depth of field (DOF) than any of the individual source images. We can simulate focus stacking in Python. The following is an example of focus stacking grayscale image frames extracted from a video using the mahotas library.
Extended depth of field with mahotas
Perform the following steps to implement focus stacking with the mahotas library functions:
- Create the image stack first by extracting grayscale image frames from a highway traffic video at night:
import mahotas as mh
def create_image_stack(vid_file, n=200):
    vidcap = cv2.VideoCapture(vid_file)
    success, image = vidcap.read()
    i = 0
    h, w = image.shape[:2]
    imstack = np.zeros((n, h, w))
    while success and i < n:
        imstack[i, ...] = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        success, image = vidcap.read()
        i += 1
    return imstack
image = create_image_stack('images/highway.mp4') #cloud.mp4
stack,h,w = image.shape
- Use the sobel() function from mahotas as the pixel-level measure of infocusness:
focus = np.array([mh.sobel(t, just_filter=True) for t in image])
- At each pixel location, select the best slice (with maximum infocusness) and create the final image:
best = np.argmax(focus, 0)
image = image.reshape((stack,-1)) # image is now (stack, nr_pixels)
image = image.transpose() # image is now (nr_pixels, stack)
final = image[np.arange(len(image)), best.ravel()] # Select the right pixel at each location
final = final.reshape((h,w)) # reshape to get final result
The following photo is an input image used in the image stack:

The following screenshot is the final output image produced by the algorithm implementation:

Object detection using color in HSV
In this recipe, you will learn how to detect objects by their color in the HSV color space using OpenCV-Python. You need to specify a range of color values within which the object you are interested in will be identified and extracted. You can then change the color of the detected object or even make the detected object transparent.
Getting ready
In this recipe, the input image will be an orange fish in an aquarium, and the object of interest will be the fish. You will detect the fish, change its color, and make it transparent using the color range of the fish in HSV space. Let's start by importing the required libraries:
import cv2
import numpy as np
import matplotlib.pylab as plt
How to do it...
To do the recipe, the following steps need to be performed:
- Read the input and background image. Convert the input image from BGR into the HSV color space:
bck = cv2.imread("images/fish_bg.png")
img = cv2.imread("images/fish.png")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
- Create a mask for the fish by selecting a possible range of HSV colors that the fish can have:
mask = cv2.inRange(hsv, (5, 75, 25), (25, 255, 255))
- Slice the orange fish using the mask:
imask = mask>0
orange = np.zeros_like(img, np.uint8)
orange[imask] = img[imask]
- Change the color of the orange fish to yellow by changing the hue channel value only (add 20) and converting the image back into the BGR space:
yellow = img.copy()
hsv[...,0] = hsv[...,0] + 20
yellow[imask] = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)[imask]
yellow = np.clip(yellow, 0, 255)
- Finally, create the transparent fish image by extracting, from the background image, the region corresponding to the foreground object (the fish), removing the fish region from the input image, and then adding these two images together:
bckfish = cv2.bitwise_and(bck, bck, mask=imask.astype(np.uint8))
nofish = img.copy()
nofish = cv2.bitwise_and(nofish, nofish, mask=(np.bitwise_not(imask)).astype(np.uint8))
nofish = nofish + bckfish
How it works...
The following screenshot shows an HSV colormap for fast color lookup. The x axis denotes hue, with values in (0, 180), and the y axis denotes saturation, with values in (0, 255); the colors shown correspond to V = 255. To locate a particular color in the colormap, just look up the corresponding H and S ranges, and then set the range of V to (25, 255). For example, the orange color of the fish we are interested in can be searched for in the HSV range from (5, 75, 25) to (25, 255, 255), as observed here:

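Another quick way to pick a workable range (a sketch; the sample BGR value is illustrative) is to convert a representative color of the object to HSV with OpenCV and take a band around the resulting hue:
import numpy as np
import cv2

sample_bgr = np.uint8([[[60, 140, 255]]])               # an orange-ish BGR sample
sample_hsv = cv2.cvtColor(sample_bgr, cv2.COLOR_BGR2HSV)
h = int(sample_hsv[0, 0, 0])
lower, upper = (max(h - 10, 0), 75, 25), (min(h + 10, 179), 255, 255)
print(lower, upper)                                     # a candidate range for cv2.inRange()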
The inRange() function from OpenCV-Python was used for color detection. It accepts three parameters: the HSV input image and the lower and upper limits of the color range to be detected. It returns a binary mask, in which white pixels represent pixels that fall within the specified range and black pixels represent those that fall outside it.
To change the color of the fish detected, it is sufficient to change the hue (color) channel value only; we don't need to touch the saturation and value channels.
The bitwise arithmetic with OpenCV-Python was used to extract the foreground/background.
Notice that the background image has slightly different colors from the fish image's background; otherwise, transparent fish would have literally disappeared (invisible cloaking!).
If you run the preceding code snippets and plot all of the images, you will get the following output:

Note that, in OpenCV-Python, an image in the RGB color space is stored in BGR format. If we want to display the image in its proper colors with Matplotlib's imshow() (which expects the image in RGB format), we must first convert the image colors with cv2.cvtColor(image, cv2.COLOR_BGR2RGB).
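For example, to display the recolored fish image (yellow) from this recipe in its correct colors:
plt.imshow(cv2.cvtColor(yellow, cv2.COLOR_BGR2RGB)), plt.axis('off'), plt.show()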
See also
For more details, refer to the following links:
- https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_colorspaces/py_colorspaces.html
- https://i.stack.imgur.com/gyuw4.png
- https://stackoverflow.com/questions/10948589/choosing-the-correct-upper-and-lower-hsv-boundaries-for-color-detection-withcv
- https://www.youtube.com/watch?v=lF0aOM3WJ74