You're reading from Hands-On Vision and Behavior for Self-Driving Cars

Product type: Book
Published in: Oct 2020
Publisher: Packt
ISBN-13: 9781800203587
Edition: 1st Edition
Authors (2):
Luca Venturi

Luca Venturi has extensive experience as a programmer with world-class companies, including Ferrari and Opera Software. He has also worked for some start-ups, including Activetainment (maker of the world's first smart bike), Futurehome (a provider of smart home solutions), and CompanyBook (whose offerings apply artificial intelligence to sales). He worked on the Data Platform team at Tapad (Telenor Group), making petabytes of data accessible to the rest of the company, and is now the lead engineer of Piano Software's analytical database.

Krishtof Korda

Krishtof Korda grew up in a mountainside home over which the US Navy's Blue Angels flew during the Reno Air Races each year. A graduate from the University of Southern California and the USMC Officer Candidate School, he set the Marine Corps obstacle course record of 51 seconds. He took his love of aviation to the USAF, flying aboard the C-5M Super Galaxy as a flight test engineer for 5 years, and engineered installations of airborne experiments for the USAF Test Pilot School for 4 years. Later, he transitioned to designing sensor integrations for autonomous cars at Lyft Level 5. Now he works as an applications engineer for Ouster, integrating LIDAR sensors in the fields of robotics, AVs, drones, and mining, and loves racing Enduro mountain bikes.


Chapter 9: Semantic Segmentation

This is probably the most advanced chapter concerning deep learning, as we will go as far as classifying an image at a pixel level with a technique called semantic segmentation. We will use plenty of what we have learned so far, including data augmentation with generators.

We will study a very flexible and efficient neural network architecture called DenseNet in great detail, as well as its extension for semantic segmentation, FC-DenseNet, and then we will write it from scratch and train it with a dataset built with Carla.

I hope you will find this chapter inspiring and challenging. And be prepared for a long training session because our task can be quite demanding!

In this chapter, we will cover the following topics:

  • Introducing semantic segmentation
  • Understanding DenseNet for classification
  • Segmenting images with CNN
  • Adapting DenseNet for semantic segmentation
  • Coding the blocks of FC-DenseNet
  • Improving bad...

Technical requirements

To be able to use the code explained in this chapter, you will need to have the following tools and modules installed:

  • The Carla simulator
  • Python 3.7
  • The NumPy module
  • The TensorFlow module
  • The Keras module
  • The OpenCV-Python module
  • A GPU (recommended)
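The book does not pin exact versions; assuming a standard Python 3.7 environment, the Python modules above can typically be installed with pip (Keras ships inside TensorFlow 2.x, and the Carla simulator is installed separately from its own releases page):

```shell
# Install the Python dependencies for this chapter.
# Keras is bundled with TensorFlow 2.x, so no separate install is needed.
pip install numpy tensorflow opencv-python
```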

The code for this chapter can be found at https://github.com/PacktPublishing/Hands-On-Computer-Vision-for-Self-Driving-Cars.

The Code in Action videos for this chapter can be found here:

https://bit.ly/3jquo3v

Introducing semantic segmentation

In the previous chapters, we implemented several classifiers, where we provided an image as input and the network said what it was. This can be excellent in many situations, but to be very useful, it usually needs to be combined with a method that can identify the region of interest. We did this in Chapter 7, Detecting Pedestrians and Traffic Lights, where we used SSD to identify a region of interest with a traffic light and then our neural network was able to tell the color. But even this would not be very useful to us, because the regions of interest produced by SSD are rectangles, and therefore a network telling us that there is a road basically as big as the image would not provide much information: is the road straight? Is there a turn? We cannot know. We need more precision.

If object detectors such as SSD brought classification to the next level, now we need to reach the level after that, and maybe more. In fact, we want to classify every...

Understanding DenseNet for classification

DenseNet is a fascinating architecture of neural networks that is designed to be flexible, memory efficient, effective, and also relatively simple. There are really a lot of things to like about DenseNet.

The DenseNet architecture is designed to build very deep networks, solving the problem of the vanishing gradient with techniques derived from ResNet. Our implementation will reach 50 layers, but you can easily build a deeper network. In fact, Keras has three types of DenseNet trained on ImageNet, with 121, 169, and 201 layers, respectively. DenseNet also solves the problem of dead neurons, which occurs when some neurons are essentially inactive. The next section will give a high-level overview of DenseNet.
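As a quick way to see these depths in practice, Keras exposes the three pretrained variants under tf.keras.applications (weights=None below skips the ImageNet download; pass weights='imagenet' to get the pretrained filters):

```python
import tensorflow as tf

# Build the 121-layer variant; DenseNet169 and DenseNet201 work the same way.
# weights=None builds an untrained network; weights="imagenet" downloads the
# pretrained ImageNet filters instead.
model = tf.keras.applications.DenseNet121(weights=None)

# The classifier head outputs one score per ImageNet class.
print(model.output_shape)
```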

DenseNet from a bird's-eye view

For the moment, we will focus on DenseNet as a classifier, which is not what we are going to implement, but it is useful as a concept to start to understand it. The high-level architecture of DenseNet...

Segmenting images with CNN

A typical semantic segmentation task receives as input an RGB image and needs to output an image with the raw segmentation, but this solution could be problematic. We already know that classifiers generate their results using one-hot encoded labels, and we can do the same for semantic segmentation: instead of generating a single image with the raw segmentation, the network can create a series of one-hot encoded images. In our case, as we need 13 classes, the network will output 13 RGB images, one per label, with the following features:

  • One image describes only one label.
  • The pixels belonging to the label have a value of 1 in the red channel, while all the other pixels are marked as 0.

Each given pixel can be 1 only in one image; it will be 0 in all the remaining images. This is a difficult task, but it does not necessarily require particular architectures: a series of convolutional layers with same padding can do it; however, their cost...
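The per-label encoding described above is simply a one-hot expansion of a class-index map. A minimal NumPy sketch (the 4x4 label map below is made up for illustration; here each class gets one channel of a single tensor, which is equivalent to one image per label):

```python
import numpy as np

NUM_CLASSES = 13  # one channel per label, as in our Carla dataset

# A toy 4x4 segmentation ground truth: each pixel holds a class index.
label_map = np.array([[0, 0, 1, 1],
                      [0, 2, 2, 1],
                      [3, 3, 2, 1],
                      [3, 3, 3, 12]])

# One-hot encode by indexing the identity matrix: shape becomes
# (H, W, NUM_CLASSES), and for every pixel exactly one channel is 1
# while the remaining 12 are 0.
one_hot = np.eye(NUM_CLASSES, dtype=np.uint8)[label_map]

print(one_hot.shape)
```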

Adapting DenseNet for semantic segmentation

DenseNet is very suitable for semantic segmentation because of its efficiency, accuracy, and abundance of skip layers. In fact, using DenseNet for semantic segmentation proves to be effective even when the dataset is limited and when a label is underrepresented.

To use DenseNet for semantic segmentation, we need to be able to build the right side of the U network, which means that we need the following:

  • A way to increase the resolution; if we call the transition layers of DenseNet transition-down layers, then we need transition-up layers.
  • A way to build the skip connections that join the left and right sides of the U network.

Our reference network is FC-DenseNet, also known as the One Hundred Layers Tiramisu, though we are not trying to reach 100 layers.

In practice, we want to achieve an architecture similar to the following:

Figure 9.8 – Example of FC-DenseNet architecture

The horizontal red arrows...
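The right-hand side of the U can be sketched as follows (the names and filter counts here are illustrative, not the book's exact code): a transition-up layer doubles the spatial resolution with a transposed convolution, and the skip connection concatenates the matching feature map from the down path:

```python
import tensorflow as tf
from tensorflow.keras import layers

def transition_up(x, skip, filters):
    """Double the spatial resolution with a transposed convolution,
    then concatenate the skip connection from the down path."""
    x = layers.Conv2DTranspose(filters, kernel_size=3, strides=2,
                               padding='same')(x)
    return layers.Concatenate()([x, skip])

# Toy tensors: a 10x10 feature map from the bottom of the U, and the
# 20x20 skip tensor saved at the matching level of the down path.
bottom = tf.keras.Input(shape=(10, 10, 48))
skip = tf.keras.Input(shape=(20, 20, 32))

out = transition_up(bottom, skip, filters=48)
print(out.shape)  # resolution doubled, channels concatenated
```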

Coding the blocks of FC-DenseNet

DenseNet is very flexible, so you can easily configure it in many ways. However, depending on the hardware of your computer, you might hit the limits of your GPU. The following are the values that I used on my computer, but feel free to change them to achieve better accuracy or to reduce the memory consumption or the time required to train the network:

  • Input and output resolution: 160 x 160
  • Growth rate (the number of channels added by each convolutional layer in a dense block): 12
  • Number of dense blocks: 11 (5 down, 1 transitioning between down and up, and 5 up)
  • Number of convolutional blocks in each dense block: 4
  • Batch size: 4
  • Bottleneck layer in the dense blocks: No
  • Compression factor: 0.6
  • Dropout: Yes, 0.2

We will define some functions that you can use to build FC-DenseNet and, as usual, you are invited to check out the full code on GitHub.

The first function just defines a convolution with batch normalization:

def dn_conv...
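The book's full dn_conv is in the GitHub repository; as a hedged sketch of what such a block typically looks like (the signature and the pre-activation BN -> ReLU -> Conv order follow the DenseNet paper, and the names here are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

def dn_conv(x, filters, kernel_size=3, dropout=0.2):
    # Pre-activation order used by DenseNet: BN -> ReLU -> Conv.
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(filters, kernel_size, padding='same',
                      use_bias=False)(x)
    if dropout:
        x = layers.Dropout(dropout)(x)
    return x

# With a growth rate of 12, each call adds 12 channels at full resolution.
inp = tf.keras.Input(shape=(160, 160, 3))
out = dn_conv(inp, filters=12)
print(out.shape)
```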

Summary

Congratulations! You completed the final chapter on deep learning.

We started this chapter by discussing what semantic segmentation means, then we talked extensively about DenseNet and why it is such a great architecture. We quickly talked about using a stack of convolutional layers to implement semantic segmentation, but we focused on a more efficient way, which is using DenseNet after adapting it to this task. In particular, we developed an architecture similar to FC-DenseNet. We collected a dataset with the ground truth for semantic segmentation, using Carla, and then we trained our neural network on it and saw how it performed when detecting roads and other objects, such as pedestrians and sidewalks. We even discussed a trick to improve the output of a bad semantic segmentation.

This chapter was quite advanced, and it required a good understanding of all the previous chapters about deep learning. It has been quite a ride, and I think it is fair to say that this...

Questions

After reading this chapter, you will be able to answer the following questions:

  1. What is a distinguishing characteristic of DenseNet?
  2. What is the name of the architecture family that inspired the authors of DenseNet?
  3. What is FC-DenseNet?
  4. Why do we say that FC-DenseNet is U-shaped?
  5. Do you need a fancy architecture like DenseNet to perform semantic segmentation?
  6. If you have a neural network that performs poorly at semantic segmentation, is there a quick fix that you can use sometimes, if you have no other options?
  7. What are skip connections used for in FC-DenseNet and other U-shaped architectures?

Further reading
