
Chapter 10. Deployment

In this chapter, we will learn how to deploy trained models on various platforms for maximum throughput and minimum latency. We will examine performance on various hardware, such as GPUs and CPUs. We will walk through the steps of deploying TensorFlow on cloud platforms such as Amazon Web Services and Google Cloud Platform, and on mobile platforms such as Android, iOS, and Tegra.

We will cover the following topics in this chapter:

  • Understanding the factors that affect the performance of deep learning model training and inference
  • Improving performance through various methods
  • Reviewing benchmarks for various hardware and learning how to tune them for maximum performance
  • Using various cloud platforms for deployment
  • Using various mobile platforms for deployment

Performance of models


Performance is important for both the training and the deployment of deep learning models. Training usually takes a long time because of large datasets or large model architectures. The resulting models may also be large, which makes them problematic to use on mobile devices where RAM is constrained. More compute time means higher infrastructure cost, and inference time is critical in video applications. For these reasons, this section looks at techniques to improve performance. Reducing model complexity is an easy option, but it also reduces accuracy. Here, we will focus on methods that improve performance with an insignificant drop in accuracy. In the next section, we will discuss the option of quantization.

Quantizing the models

The weights of deep learning models are stored as 32-bit floating-point values. When the weights are quantized to 8 bits, the drop in accuracy is small and hence barely noticeable in deployment...
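
To make the idea concrete, the following NumPy sketch shows one simple form of linear 8-bit quantization: the minimum and maximum of the float weights are mapped onto the 0 to 255 range, and an approximate float value is recovered at inference time. The function names here are illustrative only; in practice, TensorFlow provides tools that apply this kind of transform directly to a trained graph.

import numpy as np

def quantize_weights(weights):
    # Map 32-bit float weights linearly onto the 0-255 integer range.
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = max((w_max - w_min) / 255.0, 1e-8)
    quantized = np.round((weights - w_min) / scale).astype(np.uint8)
    return quantized, w_min, scale

def dequantize_weights(quantized, w_min, scale):
    # Approximate recovery of the original float weights at inference time.
    return quantized.astype(np.float32) * scale + w_min

weights = np.random.randn(3, 3).astype(np.float32)
quantized, w_min, scale = quantize_weights(weights)
restored = dequantize_weights(quantized, w_min, scale)
print(np.abs(weights - restored).max())  # the error is small relative to the weights

The quantized array takes a quarter of the memory of the original weights, at the cost of a small, bounded rounding error per weight.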

Deployment in the cloud


For many applications, models have to be deployed in the cloud. In this section, we will look at the major cloud service providers for this purpose.

AWS

Amazon Web Services (AWS) supports the development and deployment of TensorFlow-based models. Sign up for AWS at https://aws.amazon.com/ and select one of the Amazon Machine Images (AMIs). AMIs are machine images with all the required software preinstalled, so you need not worry about installing the packages yourself. AWS provides the Deep Learning AMI (DLAMI) for ease of training and deploying deep learning models. There are several options to choose from; here, we will use Conda, as it comes with several packages required for running TensorFlow. There are two options for Python: version 2 and version 3. The following code will activate TensorFlow with Keras 2 on Python 3 with CUDA 8:

source activate tensorflow_p36

The following code will activate TensorFlow with Keras 2 on Python 2 with CUDA 8:

source activate tensorflow_p27
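
Once an environment is activated, a quick sanity check from Python (a common verification, not specific to the DLAMI) confirms that the GPU-enabled TensorFlow build can see the instance's GPU:

from tensorflow.python.client import device_lib

# List the devices visible to TensorFlow; a '/device:GPU:0' entry confirms
# that the CUDA-enabled build in the Conda environment can use the GPU.
print([device.name for device in device_lib.list_local_devices()])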

Note

You can visit https...

Deployment of models in devices


TensorFlow models can also be deployed on mobile devices, such as smartphones, drones, and home robots. Billions of smartphones can run computer vision applications that use deep learning, for example taking a photo and searching with it, or streaming a video with its scenes tagged. Deploying on a mobile device means that the deep learning model is present on the device and inference happens on the device, which also helps with privacy concerns. In the following topics, we will discuss how to deploy models across various mobile platforms.
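
Regardless of the target platform, on-device deployment typically starts from a frozen copy of the trained graph, in which the variables are folded into constants and saved as a single protobuf file that the mobile runtime can load. The following is a minimal sketch; the function name, the session, and the output node name are placeholders for your own trained model, not code from this chapter:

import tensorflow as tf
from tensorflow.python.framework import graph_util

def freeze_model(sess, output_node_name, output_path='frozen_model.pb'):
    # Fold the trained variables into constants so that the graph structure
    # and the weights live in a single file suitable for bundling with an app.
    frozen_graph_def = graph_util.convert_variables_to_constants(
        sess, sess.graph_def, [output_node_name])
    with tf.gfile.GFile(output_path, 'wb') as graph_file:
        graph_file.write(frozen_graph_def.SerializeToString())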

Jetson TX2

Jetson TX2 is an embedded device supplied by NVIDIA, designed specifically for efficient AI computing. Jetson TX2 is lightweight and compact, and hence suitable for deployment in drones, public places, and so on. It also ships with TensorRT preinstalled, NVIDIA's runtime for optimized deep learning inference. You can buy a Jetson and flash it with Ubuntu, CUDA, and cuDNN before installing TensorFlow. Clone https://github.com/jetsonhacks/installTensorFlowTX2...

Summary


In this chapter, we have seen how to deploy trained deep learning models on various platforms and devices. We have covered the steps, as well as guidelines, for getting the best performance on these platforms. We have also seen the advantages of MobileNets for reducing inference time with a small trade-off in accuracy.
