You're reading from Applied Deep Learning and Computer Vision for Self-Driving Cars

Product type Book

Published in Aug 2020

Publisher Packt

ISBN-13 9781838646301

Pages 332 pages

Edition 1st Edition

Languages

Python

Concepts

Deep Learning

Authors (2):

Sumit Ranjan

Dr. S. Senthamilarasu

Implementing Semantic Segmentation

Deep learning has provided great accuracy in the field of computer vision, particularly for object detection. In the past, images were segmented using techniques such as GrabCut, superpixels, and graph cuts. The main problem with these traditional approaches was that they could not recognize what the different parts of an image represent.

In contrast, semantic segmentation algorithms aim to divide the image into meaningful categories. They associate every pixel in an input image with a class label: person, tree, street, road, car, bus, and so on. Semantic segmentation algorithms are versatile and have many use cases, including self-driving cars (SDCs).
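
To make the per-pixel labeling concrete, the following minimal sketch assumes that a network has already produced one score map per class for a tiny 2x3 image. The three class names and all score values are invented for illustration and do not come from any real model. Taking the argmax over the class axis gives each pixel the index of its highest-scoring class.

```python
import numpy as np

# Hypothetical class names and scores, invented for illustration only.
class_names = ["road", "car", "person"]

# scores[c, y, x] is the score of class c at pixel (y, x); shape (3, 2, 3).
scores = np.array([
    [[0.90, 0.80, 0.10], [0.70, 0.20, 0.10]],  # road
    [[0.05, 0.10, 0.80], [0.20, 0.70, 0.10]],  # car
    [[0.05, 0.10, 0.10], [0.10, 0.10, 0.80]],  # person
])

# Pick the highest-scoring class for every pixel.
class_map = np.argmax(scores, axis=0)

# Translate class indices into readable labels, row by row.
label_grid = [[class_names[i] for i in row] for row in class_map]
```

Here `class_map` has the same height and width as the image, and each entry is a class index rather than a color or intensity.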

In this chapter, you will learn how to perform semantic segmentation using OpenCV, deep learning, and the ENet architecture. As you read this chapter...

Semantic segmentation in images

In this section, we are going to implement one project on semantic segmentation using a popular network called ENet.

Efficient Neural Network (ENet) is one of the more popular networks due to its ability to perform real-time, pixel-wise semantic segmentation. Compared with earlier networks, ENet is up to 18x faster, requires 75x fewer FLOPs, and has 79x fewer parameters, while providing similar or better accuracy than existing models such as U-Net and SegNet. ENet is typically tested on the CamVid, Cityscapes, and SUN datasets. The model's size is 3.2 MB.

The model we are using has been trained on 20 classes:

Road
Sidewalk
Building
Wall
Fence
Pole
TrafficLight
TrafficSign
Vegetation
Terrain
Sky
Person
Rider
Car
Truck
Bus
Train
Motorcycle
Bicycle
Unlabeled
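
With 20 classes, a predicted class map holds integers from 0 to 19. As a rough sketch, assuming the indices follow the order of the list above (the actual order is defined by the model's class file), the share of the image covered by each class could be counted as follows. The helper name `class_coverage` is hypothetical, and the random class map only stands in for real model output.

```python
import numpy as np

# Class names in the order listed above; the index order is an assumption.
CLASS_NAMES = [
    "Road", "Sidewalk", "Building", "Wall", "Fence", "Pole", "TrafficLight",
    "TrafficSign", "Vegetation", "Terrain", "Sky", "Person", "Rider", "Car",
    "Truck", "Bus", "Train", "Motorcycle", "Bicycle", "Unlabeled",
]


def class_coverage(class_map, num_classes=len(CLASS_NAMES)):
    """Return the fraction of pixels assigned to each class index."""
    counts = np.bincount(class_map.ravel(), minlength=num_classes)
    return counts / class_map.size


# Stand-in for real model output: a random 4x5 map of class indices.
rng = np.random.default_rng(0)
demo_map = rng.integers(0, len(CLASS_NAMES), size=(4, 5))
coverage = class_coverage(demo_map)

# Report only the classes that actually appear in the map.
for name, share in zip(CLASS_NAMES, coverage):
    if share > 0:
        print(f"{name}: {share:.0%}")
```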

We will start with the semantic segmentation project:

First, we will import the necessary packages and libraries, such as numpy, OpenCV, and...

Semantic segmentation in videos

In this section, we are going to write a software pipeline that uses OpenCV and the ENet model to perform semantic segmentation on videos. Let's get started:

Import the necessary packages, such as numpy, imutils, and OpenCV:

import os
import time
import cv2
import imutils
import numpy as np


DEFAULT_FRAME = 1
WIDTH = 600

Then, load the class label names:

# Read one class name per line; the context manager closes the file afterward
with open('./enet-cityscapes/enet-classes.txt') as f:
    class_labels = f.read().strip().split("\n")
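
The read, strip, and split chain used here can be tried without the model files. In this small sketch, an in-memory string stands in for the label file; its contents are invented for illustration and are not the real file.

```python
import io

# Stand-in for the label file; the three names are illustrative only.
fake_file = io.StringIO("Road\nSidewalk\nBuilding\n")

# strip() drops the trailing newline so split() does not yield an empty label.
labels = fake_file.read().strip().split("\n")

# Without strip(), the final newline produces an extra empty string.
unstripped = "Road\nSidewalk\nBuilding\n".split("\n")
```

The position of each name in the resulting list is what ties it to a class index, so a stray empty entry would make the number of labels disagree with the number of classes.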

If a color file is supplied, we load the colors from disk; otherwise, we need to generate an RGB color for each class:

if os.path.isfile('./enet-cityscapes/enet-colors.txt'):
    CV_ENET_SHAPE_IMG_COLORS = open('./enet-cityscapes/enet-colors.txt').read().strip().split("\n")
    CV_ENET_SHAPE_IMG_COLORS = [np.array(c.split(",")).astype("int") for c in CV_ENET_SHAPE_IMG_COLORS]
    CV_ENET_SHAPE_IMG_COLORS = np.array(CV_ENET_SHAPE_IMG_COLORS...
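
Because the listing above is cut off, the fallback branch and the later use of the colors are not shown here. The following is only a sketch of one common approach, not necessarily the authors' code: it draws a random RGB color per class when no color file is available, then blends the resulting color mask with a frame. The helper names, the fixed seed, and the 0.4 frame weight are assumptions made for illustration.

```python
import numpy as np


def random_class_colors(num_classes, seed=42):
    """Return an array of shape (num_classes, 3) with random uint8 colors."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 256, size=(num_classes, 3), dtype=np.uint8)


def overlay_mask(frame, class_map, colors, frame_weight=0.4):
    """Blend a per-pixel color mask with the frame (both H x W x 3)."""
    mask = colors[class_map]  # look up one color per pixel
    blended = frame_weight * frame + (1.0 - frame_weight) * mask
    return blended.astype(np.uint8)


# Tiny stand-in data: a gray 2x2 frame and a 2x2 map of class indices.
colors = random_class_colors(num_classes=20)
frame = np.full((2, 2, 3), 100, dtype=np.uint8)
class_map = np.array([[0, 13], [13, 19]])
output = overlay_mask(frame, class_map, colors)
```

Indexing `colors` with the class map turns every class index into its color in one step. In an OpenCV pipeline, `cv2.addWeighted` performs the same kind of weighted blend on two images.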

Summary

In this chapter, we learned how to apply semantic segmentation using OpenCV, deep learning, and the ENet architecture. We used an ENet model pretrained on the Cityscapes dataset to perform semantic segmentation on both images and video streams. The model covers 20 classes relevant to SDCs and road scene segmentation, including vehicles, pedestrians, and buildings. We saw that ENet performs well on both images and videos. Segmentation of this kind is an important contribution to making SDCs a reality, as it helps them detect different types of objects in real time and determine where the car can drive.

In the next chapter, we are going to implement an interesting project called behavioral cloning. In this project, we are going to apply all the computer vision and deep learning knowledge we have gained from the previous chapters.

...