Computer Vision for Self-Driving Cars

The major players in the autonomous driving industry use a camera as the primary sensor in their vehicle sensor suite. Cameras are rich sensors that capture incredible detail about the environment around the vehicle, but they require extensive processing to make use of the information that's captured. Over the course of this book, you will get hands-on experience of how to algorithmically manipulate camera images to extract information that is useful for autonomous driving. 

Of all the common self-driving car sensors, the camera is the sensor that provides the most detailed visual information about objects in the environment. Information about the appearance of the surrounding environment is particularly useful for tasks requiring an understanding of the scene, such as object detection, semantic segmentation, and object identification...

Introduction to computer vision

Computer vision is a science that is used to make computers understand what is happening within an image. Some examples of the use of computer vision in self-driving cars are the detection of other vehicles, lanes, traffic signs, and pedestrians. In simple terms, computer vision helps computers understand images and videos, and determines what the computer is seeing in the surrounding environment.

The following screenshot shows how a human sees the world:

Fig 4.1: Human eye interpretation

In the preceding screenshot, we can see that humans see using their eyes. The visual information captured by their eyes is then interpreted in the brain, enabling the individual to conclude that the object is a bird. Similarly, in computer vision, the camera takes the role of the human eye and the computer takes the role of the brain, as shown in the following screenshot:

Fig 4.2: Computer interpretation 

Now the question is, what process actually happens in computer...

Challenges in computer vision

We will look at the following challenges in computer vision:

  • Viewpoints
  • Camera limitations
  • Lighting
  • Scaling
  • Object variation

Viewpoints: The first challenge is that of viewpoints. In the following screenshot, both pictures are of roads, but the viewpoints are different. Therefore, it is very difficult to take these images and get our computer to generalize and become smart enough to be able to detect all roads from every viewpoint. This is very simple for humans, but for computers, it is a little more challenging. We will see more challenges like this in Chapter 5, Finding Road Markings Using OpenCV:

Fig 4.3: Viewpoints 

Camera limitations: The next challenge is camera limitations. A higher-quality camera produces better pictures. Image resolution is measured in pixels, so, broadly speaking, the more pixels a camera captures, the more detail is available for our algorithms to work with.

Lighting: Lighting is also a challenge. The following photos show roads in light and...

Artificial eyes versus human eyes 

In this section, we will compare the requirements of artificial eyes with human eyes. In the following table, we can see the differences between the requirements of artificial eyes for self-driving cars and the capabilities of human eyes:

Self-driving car requirement | Human eye
It requires 360-degree coverage around the vehicle. | It has 3D vision over roughly a 130-degree field of view, leaving a blind spot; humans turn their heads and bodies to compensate.
It must identify 3D objects that are close to and far from the vehicle. | The eye's high resolution covers only the central 50 degrees of the field of view; outside this central zone, perception drops.
It must process real-time data. | Visual acuity is good in the central zone and poor in the periphery.
It should be able to work well in all lighting and weather conditions. | Human eyes perform well in various lighting conditions,...

Building blocks of an image

In this section, we will learn the fundamentals of how to represent an image in a digital format and how to use images in a better way in the machine learning world for tasks such as image manipulation.

We will start by looking at how humans see color. Let's assume we have a yellow box. Light reflects off the box and reaches our eyes with a wavelength of roughly 570 to 580 nanometers (the wavelength of yellow light). The wavelengths of the reflected light determine what color we see: the visual cortex of the brain translates the light captured by the eye into the color yellow.

In the next section, we will read about the digital representation of images. We are going to use the OpenCV library to process an image.

Digital representation of an image

Now we will see how to represent an image digitally. We will start with grayscale images. A grayscale picture is one in which shades of gray are the only colors; such images are also commonly called black and white images. Because each pixel carries a single intensity value, a grayscale image is a simple representation that is easy to process in many applications. Let's look at an image of a car. This image is stored digitally in the form of pixels:

Fig 4.6: Grayscale image 

Each pixel holds a number that ranges from 0 to 255. If a pixel's value is 0, the pixel is black; if its value is 255, it is white. As this number increases, so does the pixel's brightness. In the following screenshot, we can see that the black pixels contain the number 0 and the white pixels contain the number 255, while gray pixels take values between 0 and 255. This is essentially how we represent the image in a decimal...
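As a minimal illustration of this representation (the array values below are made up for demonstration), a grayscale image is simply a NumPy array of intensities between 0 and 255:

import numpy as np

# A hypothetical 3 x 3 grayscale patch: 0 is black, 255 is white,
# and intermediate values are shades of gray
patch = np.array([[0,   128, 255],
                  [64,  128, 192],
                  [255, 128, 0]], dtype=np.uint8)

print(patch.shape)               # (3, 3): height x width, one value per pixel
print(patch.min(), patch.max())  # 0 255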

Converting images from RGB to grayscale

In this section, we will use a powerful image-processing library called OpenCV and use it to convert an image to grayscale. We will take a color image of a road, as shown in the following screenshot:

Fig 4.10: Sample image

In the following steps, we will convert the color image into grayscale using the OpenCV library:

  1. First, import the matplotlib (mpimg and pyplot), numpy, and openCV libraries:
In[1]: import matplotlib.image as mpimg
In[2]: import matplotlib.pyplot as plt
In[3]: import numpy as np
In[4]: import cv2
  2. Next, import the image for the operation:
In[5]: image_color = mpimg.imread('image.jpg')
In[6]: plt.imshow(image_color)

Let's see what our image looks like: 

Fig 4.11: Reading an image using matplotlib
  3. In the preceding screenshot, the image has three channels because it is in an RGB format. Let's check the shape of the image. We can see that the value is (515, 763, 3):
In [7]: image_color...
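The remaining steps of this section are not shown above. As a minimal sketch of how the shape check and the conversion itself could continue (assuming the same image.jpg as in step 2), note that matplotlib loads channels in RGB order, so COLOR_RGB2GRAY is the appropriate flag:

import cv2
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

image_color = mpimg.imread('image.jpg')   # matplotlib loads channels in RGB order
print(image_color.shape)                  # (515, 763, 3): height, width, channels

# Because the channels are RGB (not OpenCV's default BGR), use COLOR_RGB2GRAY
image_gray = cv2.cvtColor(image_color, cv2.COLOR_RGB2GRAY)
plt.imshow(image_gray, cmap='gray')
plt.show()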

Road-marking detection 

In this section, we are going to perform image manipulation by highlighting the white sections of the road markings within a grayscale image and a color image. We will start by detecting these sections in the grayscale image.

Detection with the grayscale image

We will start by using OpenCV techniques with the grayscale image:

  1. Start by importing the matplotlib (mpimg and pyplot), numpy, and openCV libraries as follows:
In[1]: import matplotlib.image as mpimg
In[2]: import matplotlib.pyplot as plt
In[3]: import numpy as np
In[4]: import cv2
  2. Next, read the image and convert it into a grayscale image:
In[5]: image_color = mpimg.imread('Image_4.12.jpg')
In[6]: image_gray = cv2.cvtColor(image_color, cv2.COLOR_RGB2GRAY)  # mpimg loads RGB, so convert with RGB2GRAY
In[7]: plt.imshow(image_gray, cmap = 'gray')

We have already seen what the image looks like; it is the grayscale conversion of the color image:

Fig 4.13: Color image to grayscale 
  3. Now we will check the shape of the image, which is (515, 763):
In[8]: image_gray.shape
Out[8]: (515, 763)
  4. Now we will apply a filter to identify the white pixels of the image:
In[9]: image_copy = np.copy(image_gray)

# any value that is not white colour
In[10]: image_copy[ (image_copy...
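The thresholding line is cut off above. A minimal sketch of the idea is shown below; the 250 cutoff is an assumed value and can be tuned per image:

import cv2
import numpy as np
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

image_color = mpimg.imread('Image_4.12.jpg')
image_gray = cv2.cvtColor(image_color, cv2.COLOR_RGB2GRAY)

image_copy = np.copy(image_gray)
# Any pixel that is not close to white is set to black (0);
# 250 is an assumed threshold, not necessarily the original value
image_copy[image_copy < 250] = 0

plt.imshow(image_copy, cmap='gray')
plt.show()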

Detection with the RGB image

Now we will find the road markings in an RGB image: 

  1. First, import the matplotlib (mpimg and pyplot), numpy, and openCV libraries:
In[1]: import matplotlib.image as mpimg
In[2]: import matplotlib.pyplot as plt
In[3]: import numpy as np
In[4]: import cv2
  2. Read the image as follows:
In[5]: image_color = mpimg.imread('image.jpg')

We have read the image, and this is what it looks like:

Fig 4.15: Sample image
  3. Now we will check the shape of the image, which is (280, 660, 3):
In[6]: image_color.shape
Out[6]: (280, 660, 3)
  4. Next, we will detect the white lines. We can play with the values in line 8 of the following code to get sharper results: we filter out any pixel whose channel 1 value is less than 209, whose channel 2 value is less than 200, or whose channel 3 value is less than 200. Different images require different values to get sharper results. We will look at the following code:
In[7]: image_copy = np.copy(image_color)

# Any value that is not white
In[8...
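The filtering line is cut off above. A minimal sketch of the idea, using the channel thresholds mentioned in step 4 (other images may need different values):

import numpy as np
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

image_color = mpimg.imread('image.jpg')
image_copy = np.copy(image_color)

# Any pixel that is not "white enough" in all three channels is set to black;
# the thresholds 209, 200, and 200 come from the text above
mask = (image_copy[:, :, 0] < 209) | \
       (image_copy[:, :, 1] < 200) | \
       (image_copy[:, :, 2] < 200)
image_copy[mask] = 0

plt.imshow(image_copy)
plt.show()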

Challenges in color selection techniques

In the previous section, we learned how to extract a specific color from a grayscale and color image, and we also identified road marking pixels. But there are a few challenges that might arise when using these techniques. What if the road markings aren't white? What if it's night time, or the weather is different? These are the challenges that we face when programming self-driving cars.

One of the main challenges is the color-selection techniques. Here, we are required to develop a sophisticated algorithm that will work in all conditions, whether it is night time or snowing. There are, however, ways to overcome this challenge:

  • We can use advanced computer vision techniques to extract more features from images, such as edge detection, which we will cover later in this chapter.
  • We can use LIDAR to create a high-resolution 3D digital map of the SDC's surroundings. During ideal weather conditions, the LIDAR collects 2.8 million...

Color space techniques

In this section, we are going to firstly explore different color spaces, which are very important in image analysis for self-driving cars. We will explore the following:

  • RGB color space
  • HSV color space

The red green blue (RGB) color space describes colors in terms of red, green, and blue, whereas the hue saturation value (HSV) color space describes colors in terms of hue, saturation, and value.

The HSV color space is preferable over RGB color space when performing image analytics because it describes colors in a way that is closer to our own perception of color, accounting for vibrancy and brightness rather than simply a combination of primary colors.

In the next section, we will learn about the RGB color space.

Introducing the RGB space

We will start with the most popular color space, RGB. As we know, RGB is made up of the colors red, green, and blue. We can mix them up to produce any color:

Fig 4.17: RGB color space

You can check the image and the license at https://commons.wikimedia.org/wiki/File:RGB_Cube_Show_lowgamma_cutout_b.png#/media/File:RGB_Cube_Show_lowgamma_cutout_a.

In the OpenCV library, colors are stored in BGR format, not in RGB format. So when we load an image using OpenCV, the channel order is reversed: it starts with blue, moves to green, and ends with red.
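As a small sketch of what this means in practice (the filename is an assumption), we can load an image with OpenCV and reorder the channels before displaying it with matplotlib, which expects RGB:

import cv2
import matplotlib.pyplot as plt

image_bgr = cv2.imread('image.jpg')                     # OpenCV loads channels as B, G, R
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)  # reorder to R, G, B for matplotlib

plt.imshow(image_rgb)   # displaying image_bgr directly would show swapped colors
plt.show()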

The RGB color table for various colors is shown in the following screenshot:

Fig 4.18: RGB color table

In the next section, we are going to learn about the HSV color space in detail.

HSV space

HSV stands for hue, saturation, and value (or brightness). The HSV color space can be seen in the following screenshot:

Fig 4.19: HSV color space

You can check the image and the license at https://en.wikipedia.org/wiki/HSL_and_HSV#/media/File:HSV_color_solid_cylinder_saturation_gray.png. In HSV, the color space stores the information in cylindrical format, as can be seen in the preceding screenshot.

The values of HSV are as follows:

  • Hue: Color value (0–360)
  • Saturation: Vibrancy of color (0–255)
  • Value: Brightness or intensity (0–255)

Why should we use the HSV color space? The HSV color model is preferred by many designers because it represents color in a way that is closer to how we describe it, which is useful when selecting a color or an ink. It is easy for people to relate to colors in the HSV model because an image is described using the three intuitive parameters of color (hue), vibrancy (saturation), and brightness (value).

We can specify the color on the basis of the angle...

Color space manipulation 

In this section, we will learn how to manually convert RGB to HSV and RGB to grayscale in an image using the OpenCV computer vision library. 

Some examples of the conversion from RGB to HSV can be seen in the following screenshot:

Fig 4.20: RGB to HSV conversion

In the preceding diagram, we can see how the values of the image formats are different in RGB and HSV. For example, red is represented as (255,0,0) in the RGB format and as (0,100,100) in the HSV format.

Next, we will convert RGB to HSV using Python:

  1. We are going to use the matplotlib (pyplot and mpimg), numpy, and openCV libraries, which can be imported as follows:
In[1]: import matplotlib.image as mpimg
In[2]: import matplotlib.pyplot as plt
In[3]: import numpy as np
In[4]: import cv2
  2. Then we will read and display the image using OpenCV:
In[5]: image = cv2.imread('Test_image.jpg')
  3. Now print and check the dimensions of the image. Because it is a color image, it will...
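The remaining steps are not shown above. As a minimal sketch of the conversions this section describes (assuming the same Test_image.jpg as in step 2):

import cv2
import matplotlib.pyplot as plt

image = cv2.imread('Test_image.jpg')                  # BGR image
print(image.shape)                                    # (height, width, 3) for a color image

image_hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)    # convert BGR to HSV
image_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # convert BGR to grayscale

# Note: matplotlib interprets the HSV array as if it were RGB, so the colors look odd
plt.imshow(image_hsv)
plt.show()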

Introduction to convolution

Convolutions are used to scan an image and apply a filter, using a kernel matrix, to extract a certain feature. An image kernel is a matrix that is used to apply effects such as blurring and sharpening. Kernels are used in machine learning for feature extraction, that is, selecting the most important pixels of an image, and convolution also preserves the spatial relationship between pixels.

In the following screenshot, we can see that after applying kernels, the example image is transformed into feature maps:

Fig 4.32: Applying kernels 

In Fig 4.33, we can see how the convolution works. We have an example of a grayscale image, the blue box is the kernel, and the green box is the final image. In general, the kernel is applied to the entire image and scans the features of the image. Convolution can be used when generating a new image, scaling down the image, blurring the image, or sharpening the image, depending on the value of the kernel we use...
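As a minimal sketch of how a kernel is applied to an image with OpenCV (the filename and kernel values are assumptions chosen for illustration), cv2.filter2D slides the kernel over every pixel:

import cv2
import numpy as np

image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

# A simple 3 x 3 edge-enhancing kernel
kernel = np.array([[ 0, -1,  0],
                   [-1,  5, -1],
                   [ 0, -1,  0]], dtype=np.float32)

# ddepth=-1 keeps the output depth the same as the input;
# for symmetric kernels, correlation and convolution give the same result
feature_map = cv2.filter2D(image, -1, kernel)

cv2.imwrite('feature_map.jpg', feature_map)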

Sharpening and blurring

We use different types of kernels for sharpening and blurring images. The kernel for sharpening (the sharpen kernel) highlights the differences in adjacent pixel values, which emphasizes detail by enhancing contrast.

We will look at different examples of sharpening kernels in which the center pixel is multiplied by 9 or 5 and the surrounding pixels by -1 or 0, as shown in the following matrices. The sharpening kernel is simply a way of enhancing the intensity of a pixel relative to its neighbors at any point in the image.

Sharpening kernel type 1 (center weight 9, all eight neighbors -1):

 -1 -1 -1
 -1  9 -1
 -1 -1 -1

Sharpening kernel type 2 (center weight 5, horizontal and vertical neighbors -1, corners 0):

  0 -1  0
 -1  5 -1
  0 -1  0

Next, we will look at blurring kernels.

A blurring kernel blurs an image by averaging each pixel value with its neighbors. It is an N x N matrix filled with ones that must then be normalized so that its values collectively sum to 1. If the sum is not 1, the resulting image becomes brighter or darker, as shown in Fig 4.36...
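As a short sketch (the filename is an assumption), the two sharpening kernels above and a normalized 3 x 3 blurring kernel can be applied with cv2.filter2D:

import cv2
import numpy as np

image = cv2.imread('image.jpg')

sharpen_1 = np.array([[-1, -1, -1],
                      [-1,  9, -1],
                      [-1, -1, -1]], dtype=np.float32)

sharpen_2 = np.array([[ 0, -1,  0],
                      [-1,  5, -1],
                      [ 0, -1,  0]], dtype=np.float32)

# 3 x 3 blurring kernel: all ones, divided by 9 so the values sum to 1
blur = np.ones((3, 3), dtype=np.float32) / 9.0

sharpened = cv2.filter2D(image, -1, sharpen_1)
blurred = cv2.filter2D(image, -1, blur)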

Edge detection and gradient calculation

Edge detection is a very important feature-extraction technique for self-driving cars, and it builds on the convolution operation we discussed in the previous section. So far, we have converted color images to grayscale or HSV and applied convolution to extract features from them. In this section, we will learn about edge detection and gradient calculation.

Edge detection is a computer-vision feature-extraction tool that is used to detect the sharp changes in an image. 

Let's say that we have three pixels. The first pixel is white, which is represented by 255 (as we have already learned in a previous section of this chapter); the next pixel is 0, which represents black; and the third pixel is also 255. So this means that we are going from white to black and then back to white. Edge detection happens when pixels change...

Introducing Sobel 

The Sobel edge detector is a gradient-based method that uses first-order derivatives. It calculates the first-order derivative of the image separately for the x axis and the y axis, using two 3 x 3 kernels that are convolved with the original image. For an image A, Gx and Gy are the two images containing the horizontal and vertical derivative approximations, and the * character indicates the 2D signal-processing convolution operation.

The Sobel kernels compute the gradient with smoothing, since each kernel can be decomposed into the product of an averaging kernel and a differentiation kernel. For example, the horizontal kernel is the outer product of the smoothing column vector [1, 2, 1] and the derivative row vector [-1, 0, +1]. Here, the x coordinate increases in the right direction, and the y coordinate increases in the downward direction.

The resulting gradient approximations at each point in the image can be merged...
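As a minimal sketch of the Sobel operator in OpenCV (the filename is an assumption), we can compute the two derivative images and merge them into a gradient magnitude:

import cv2
import numpy as np

image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

# First-order derivative along x (responds to vertical edges) and along y
# (responds to horizontal edges), using 3 x 3 Sobel kernels;
# CV_64F keeps negative gradient values
grad_x = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=3)
grad_y = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=3)

# Merge the two approximations into a single gradient magnitude
magnitude = np.sqrt(grad_x ** 2 + grad_y ** 2)
magnitude = np.uint8(255 * magnitude / magnitude.max())

cv2.imwrite('sobel_magnitude.jpg', magnitude)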

Introducing the Laplacian edge detector

The Laplacian edge detector uses only one kernel. It calculates second-order derivatives in a single pass and detects zero crossings. In general, the second-order derivative is extremely sensitive to noise.

The kernel for the Laplacian edge detector is shown in the following screenshot:

Fig 4.45: The Laplacian operator

The following is an example of gradient-based edge detection and Laplacian-based edge detection. We can see that the first-order derivative is calculated using gradient-based edge detection, and second-order derivatives are calculated using Laplacian edge detection:

Fig 4.46: Gradient versus Laplacian edge detection
The objective of this book is to introduce you to the different edge detection concepts. If you want to read about these in more detail, you can go to https://en.wikipedia.org/wiki/Edge_detection.
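As a minimal sketch (the filename and kernel size are assumptions), the Laplacian edge detector can be applied with OpenCV as follows; we blur first because the second-order derivative is very sensitive to noise:

import cv2
import numpy as np

image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

# Smooth first: the second-order derivative amplifies noise
blurred = cv2.GaussianBlur(image, (3, 3), 0)

# Single-pass second-order derivative; zero crossings mark the edges
laplacian = cv2.Laplacian(blurred, cv2.CV_64F)
laplacian = np.uint8(np.absolute(laplacian))

cv2.imwrite('laplacian_edges.jpg', laplacian)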

In the next section, we will learn about an important concept called Canny edge detection.

Canny edge detection

The Canny edge detector is a popular edge-detection algorithm that can detect a wide range of edges. It was developed by John F. Canny in 1986 and is widely used in the field of computer vision because of its broad range of applications.

The process of Canny edge detection has the following criteria:

  • The edges of images should be detected with high accuracy.
  • Only one mark should be created for each edge; there should not be any duplicate marks.
  • The detected edges should be correctly localized on the image.
  • Granular edges should also be detected.

The Canny edge detection algorithm is applied using the following steps (a short code sketch follows the list):

  1. In the first step, a Gaussian filter is applied to smooth the image. Smoothing the image removes the noise.
  2. Next, we find the intensity gradient of the image.
  3. Then, we apply non-maximum suppression to remove spurious edge-detection responses.
  4. Next, we apply a double-threshold on the image to determine the accuracy...
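OpenCV performs steps 2 to 4 inside a single function. As a minimal sketch (the filename and the two thresholds are assumptions):

import cv2

image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

# Step 1: Gaussian smoothing to remove noise
blurred = cv2.GaussianBlur(image, (5, 5), 0)

# Steps 2 to 4 (gradient, non-maximum suppression, double threshold with
# hysteresis) are performed internally by cv2.Canny; 50 and 150 are assumed thresholds
edges = cv2.Canny(blurred, 50, 150)

cv2.imwrite('canny_edges.jpg', edges)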

Image transformation

In this section, we will learn about different image-transformation techniques, such as rotation, translation, resizing, and masking a region of interest. Image transformations are used to correct distorted images or to change the perspective of an image. With regard to self-driving cars, there are lots of applications of image transformation. We have different cameras mounted in the car, and most of the time they are required to transform the image. Sometimes, by transforming the image, we allow the car to concentrate on an area of interest. We will also look at a project on behavioral cloning in Chapter 9, Implementation of Semantic Segmentation.

There are two types of image transformation:

  • Affine transformation 
  • Projective transformation 

Affine transformation

A linear mapping method that preserves points, straight lines, and planes is called affine transformation. After affine transformation, sets of parallel lines will remain parallel. In general, the affine transformation technique is used in the correction of geometric distortions that occur with nonideal camera angles.

An example of affine transformation is as follows. Here, we apply an affine transformation to a rectangle to produce a shear, that is, each point is moved a distance proportional to its position along an axis:

Fig 4.53: Applying affine transformation to a rectangle
You can learn about affine transformation in more detail at https://en.wikipedia.org/wiki/Affine_transformation.

In the preceding diagram, we can see how a rectangle with four points changes after we apply affine transformation.
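As a minimal sketch of an affine transformation in OpenCV (the filename and the point coordinates are assumptions), three point correspondences are enough to define the 2 x 3 matrix:

import cv2
import numpy as np

image = cv2.imread('test_image.jpg')
height, width = image.shape[:2]

# Three points in the source image and where they should land in the output;
# the coordinates are illustrative values only
src_pts = np.float32([[50, 50], [200, 50], [50, 200]])
dst_pts = np.float32([[10, 100], [200, 50], [100, 250]])

M = cv2.getAffineTransform(src_pts, dst_pts)     # 2 x 3 affine matrix
warped = cv2.warpAffine(image, M, (width, height))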

Projective transformation

A transformation that maps lines to lines is called a projective transformation. Here, lines that were parallel may end up at an angle to each other and may intersect at some point. For example, the lines between a - d and d - c on the left side of the following diagram are not parallel and could meet at some point. At the same time, the lines between a - d and b - c are parallel and will never meet. After applying the projective transform, none of the sides will have parallel lines:

Fig 4.54: Applying projective transformation
 You can learn more about projective transformation at https://en.wikipedia.org/wiki/Homography.

In the next section of the chapter, we will learn about image rotation using the OpenCV library.

Image rotation 

In this section, we will learn how we can perform a rotation by using OpenCV and the rotation matrix, M. A rotation matrix is a matrix that is used to perform a rotation in Euclidean space. It rotates points in the xy plane counterclockwise through an angle, 𝜃, around the origin.

Now we will implement image rotation using OpenCV: 

  1. We will first import the matplotlib (mpimg and pyplot), numpy, and openCV libraries:
In[1]: import cv2
In[2]: import numpy as np
In[3]: import matplotlib.image as mpimg
In[4]: from matplotlib import pyplot as plt
In[5]: %matplotlib inline
  2. Next, we will read the input image:
In[6]: image = cv2.imread('test_image.jpg')
In[7]: cv2.imshow('Original Image', image)
In[8]: cv2.waitKey()
In[9]: cv2.destroyAllWindows()

The input image looks like this:

Fig 4.55: Input image
  3. The height and width of the image are as follows:
In[10]: height, width = image.shape[:2] 
In[11]: height
579
In[12]: width
530
...
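The remaining steps are not shown above. As a minimal sketch of how the rotation itself might be performed (the 90-degree angle is an assumption):

import cv2

image = cv2.imread('test_image.jpg')
height, width = image.shape[:2]

# Rotation matrix M: rotate 90 degrees counterclockwise about the image center,
# with no scaling (scale factor 1)
M = cv2.getRotationMatrix2D((width / 2, height / 2), 90, 1)
rotated = cv2.warpAffine(image, M, (width, height))

cv2.imshow('Rotated Image', rotated)
cv2.waitKey()
cv2.destroyAllWindows()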

Image translation

In this section, we will learn about image translation. Image translation involves shifting an object's position in the x and/or y direction. OpenCV uses a translational matrix, T, as follows:

Now, we will perform image translation:

  1. We will first import the matplotlib (mpimg and pyplot), numpy, and openCV libraries:
In[1]: import cv2
In[2]: import numpy as np
In[3]: import matplotlib.image as mpimg
In[4]: from matplotlib import pyplot as plt
In[5]: %matplotlib inline
  2. Then we read in the input image:
In[6]: image = cv2.imread('test_image.jpg')
In[7]: cv2.imshow('Original Image', image)
In[8]: cv2.waitKey()
In[9]: cv2.destroyAllWindows()

The input image looks like this:

Fig 4.57: Input image
  3. The height and width of the image are as follows:
In[10]: height, width = image.shape[:2] 
In[11]: height
579
In[12]: width
530

The translation matrix is defined as follows:

In[13]: Translational_Matrix = np.float32([[1, 0, 120], 
...
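The remaining lines are cut off above. As a minimal sketch of the full translation (the 120-pixel horizontal shift matches the value visible above; the 50-pixel vertical shift is an assumption):

import cv2
import numpy as np

image = cv2.imread('test_image.jpg')
height, width = image.shape[:2]

# T = [[1, 0, tx], [0, 1, ty]] shifts the image tx pixels right and ty pixels down;
# tx = 120 appears in the snippet above, ty = 50 is an assumed value
T = np.float32([[1, 0, 120],
                [0, 1, 50]])
translated = cv2.warpAffine(image, T, (width, height))

cv2.imshow('Translated Image', translated)
cv2.waitKey()
cv2.destroyAllWindows()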

Image resizing 

This section is all about image resizing. Resizing in OpenCV is performed with cv2.resize(). The preferred interpolation methods are cv2.INTER_AREA for shrinking and cv2.INTER_CUBIC for zooming (enlarging). By default, cv2.INTER_LINEAR is used for all resizing purposes:

  1. First, we will import the numpy, openCV, and matplotlib (mpimg and pyplot) libraries:
In[1]: import cv2
In[2]: import numpy as np
In[3]: import matplotlib.image as mpimg
In[4]: from matplotlib import pyplot as plt
In[5]: %matplotlib inline
  2. Then we read in the input image:
In[6]: image = cv2.imread('test_image.jpg')
In[7]: cv2.imshow('Original Image', image)
In[8]: cv2.waitKey()
In[9]: cv2.destroyAllWindows()

The input image looks like this:

Fig 4.59: Input image
  3. The height and width of the image are as follows:
In[10]: height, width = image.shape[:2] 
In[11]: height
579
In[12]: width
530
  4. Next, we perform the resize using OpenCV:
In...
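The resize call itself is cut off above. As a minimal sketch (the scale factors are assumptions):

import cv2

image = cv2.imread('test_image.jpg')

# Shrink to half size: INTER_AREA is preferred for shrinking
smaller = cv2.resize(image, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)

# Enlarge to double size: INTER_CUBIC is preferred for zooming
larger = cv2.resize(image, None, fx=2.0, fy=2.0, interpolation=cv2.INTER_CUBIC)

print(image.shape, smaller.shape, larger.shape)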

Perspective transformation

Perspective transformation is an important aspect of programming self-driving cars, and it is more complicated than affine transformation. In perspective transformation, we use a 3 x 3 transformation matrix to change the apparent viewpoint of an image, for example mapping a plane captured at an angle in the 3D world onto a flat, front-on 2D image.

An example of perspective transformation is shown in Fig 4.61. In the following screenshot, we can see the tilted chessboard, and once the perspective transform is applied, the board is transformed into a normal chessboard with a top-down view. This has numerous applications in the field of self-driving cars, as roads have many objects that require perspective transformation to be processed:

Fig 4.61: Perspective transform using a chessboard

Now we will implement perspective transformation to a traffic signboard using the OpenCV library:

  1. We will first import the matplotlib (mpimg and pyplot), numpy, and openCV libraries:
In[1]: import cv2
In[2]: import numpy as...
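The remaining steps are cut off above. As a minimal sketch of a perspective transform applied to a tilted signboard (the filename, the corner coordinates, and the output size are assumptions):

import cv2
import numpy as np

image = cv2.imread('test_image.jpg')

# Four corners of the tilted signboard in the source image (assumed values),
# and the rectangle they should map to in the output
src_pts = np.float32([[320, 15], [700, 215], [85, 610], [530, 780]])
dst_pts = np.float32([[0, 0], [420, 0], [0, 594], [420, 594]])

M = cv2.getPerspectiveTransform(src_pts, dst_pts)   # 3 x 3 transformation matrix
warped = cv2.warpPerspective(image, M, (420, 594))  # front-on view of the sign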

Cropping, dilating, and eroding an image

Image cropping is a type of image transformation. In this section, we will crop an image using OpenCV:

  1. Firstly, we will import the matplotlib (mpimg and pyplot), numpy, and openCV libraries:
In[1]: import cv2
In[2]: import numpy as np
In[3]: import matplotlib.image as mpimg
In[4]: from matplotlib import pyplot as plt
In[5]: %matplotlib inline
  2. Next, we will read the input image:
In[6]: image = cv2.imread('Test_auto_image.jpg')
In[7]: cv2.imshow('Original Image', image)
In[8]: cv2.waitKey()
In[9]: cv2.destroyAllWindows()

The input image is as follows:

Fig 4.66: Input image

The height and width of the image are as follows:

In[10]: height, width = image.shape[:2]
In[11]: height
800
In[12]: width
1200
  3. Next, we perform cropping with the following code. The top-left coordinates of the desired cropped area are w0 and h0:

In[13]: w0 = int(width * 0.5)
In[14]: h0 = int(height * 0.5)
  4.  The bottom-right coordinates...
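The remaining steps are cut off above. As a minimal sketch of the cropping, together with the dilation and erosion mentioned in the section title (the kernel size is an assumption):

import cv2
import numpy as np

image = cv2.imread('Test_auto_image.jpg')
height, width = image.shape[:2]

# Crop the bottom-right quarter by slicing the pixel array: rows h0 onward, columns w0 onward
w0, h0 = int(width * 0.5), int(height * 0.5)
cropped = image[h0:height, w0:width]

# Dilation grows bright regions, erosion shrinks them; a 5 x 5 kernel is assumed
kernel = np.ones((5, 5), np.uint8)
dilated = cv2.dilate(image, kernel, iterations=1)
eroded = cv2.erode(image, kernel, iterations=1)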

Masking regions of interest

The main goal of masking a region of interest is to filter the image so that subsequent operations are applied only to the part of the scene that matters. For instance, when an autonomous car is driving, the region of interest is the lane lines, because the car must stay on the road. In the following steps, we will see an example of how to determine a region of interest covering the lane lines on a road:

  1. Firstly, import the matplotlib (mpimg and pyplot), numpy, and openCV libraries:
In[1]: import cv2
In[2]: import numpy as np
In[3]: import matplotlib.image as mpimg
In[4]: from matplotlib import pyplot as plt
In[5]: %matplotlib inline
  2. Next, we will read the input image:
In[6]: image_color = cv2.imread('lanes.jpg')
In[7]: cv2.imshow('Original Image', image_color)
In[8]: cv2.waitKey()
In[9]: cv2.destroyAllWindows()

The input image looks like this:

Fig 4.71: Input image

The height and width of the image are as follows:

In[10]: height...
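The remaining steps are cut off above. As a minimal sketch of masking a triangular region of interest around the lane (the vertex coordinates are assumptions):

import cv2
import numpy as np

image_color = cv2.imread('lanes.jpg')
height, width = image_color.shape[:2]

# Triangular region roughly covering the lane ahead (assumed vertices)
vertices = np.array([[(0, height), (width // 2, int(height * 0.55)), (width, height)]],
                    dtype=np.int32)

mask = np.zeros_like(image_color)
cv2.fillPoly(mask, vertices, (255, 255, 255))       # white polygon on a black mask
masked_image = cv2.bitwise_and(image_color, mask)   # keep only pixels inside the polygon

cv2.imshow('Region of interest', masked_image)
cv2.waitKey()
cv2.destroyAllWindows()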

The Hough transform

The Hough transform is one of the most important topics in computer vision. It is used in feature extraction and image analysis. The technique goes back to a 1962 patent by Paul Hough and was given its modern form by Richard Duda and Peter Hart in 1972; it was later extended into the generalized Hough transform. In general, the technique is used to find imperfect instances of objects within a certain class of shapes by means of a voting procedure.

We can use the Hough transform along with region of interest masking. We will see an example of the detection of road markings in Chapter 5, Finding Road Markings Using OpenCV, using the Hough transform and region of interest masking together.

We will learn about the Hough transform in more detail by drawing a straight line in a 2D coordinate space of x and y, as shown in Fig 4.74.

We know that the equation of a straight line is y = mx + c. The straight line has two parameters, m and c, and we are currently plotting it...
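As a preview of how the Hough transform will be combined with edge detection and region masking in Chapter 5, the following is a minimal sketch using OpenCV's probabilistic Hough transform (the filename and all threshold values are assumptions):

import cv2
import numpy as np

image = cv2.imread('lanes.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)

# Probabilistic Hough transform: rho and theta are the accumulator resolutions,
# 100 is the vote threshold; minLineLength and maxLineGap are assumed values
lines = cv2.HoughLinesP(edges, rho=2, theta=np.pi / 180, threshold=100,
                        minLineLength=40, maxLineGap=5)

if lines is not None:
    for x1, y1, x2, y2 in lines.reshape(-1, 4):
        cv2.line(image, (x1, y1), (x2, y2), (0, 255, 0), 3)  # draw detected segments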

Summary

In this chapter, we learned about the importance of computer vision and the challenges we face in the field of computer vision. We also learned about color spaces, edge detection, and the different types of image transformation, as well as the many examples of using OpenCV. We are going to use a few of these techniques in later chapters. 

We learned about the building blocks of an image and how a computer sees an image. We also learned about the importance of color space techniques and about convolution. We are going to apply all the techniques that we covered here in future chapters.

In the next chapter, we are going to apply computer-vision techniques and implement a software pipeline for detecting road markings. We will first apply this process to an image and then apply it to a video. In the next chapter, we are going to apply several of the techniques that we covered in this chapter, such as edge detection and Hough transformation.
