You're reading from Deep Learning for Computer Vision

Product type: Book
Published in: Jan 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781788295628
Edition: 1st Edition
Author (1)
Rajalingappaa Shanmugamani

Rajalingappaa Shanmugamani is currently working as an Engineering Manager for a deep learning team at Kairos. Previously, he worked as a Senior Machine Learning Developer at SAP, Singapore, and at various startups developing machine learning products. He has a Master's degree from the Indian Institute of Technology Madras. He has published articles in peer-reviewed journals and conferences and submitted applications for several patents in the area of machine learning. In his spare time, he coaches programming and machine learning to school students and engineers.

Preface

Deep Learning for Computer Vision is a book intended for readers who want to learn deep-learning-based computer vision techniques for various applications. This book will give the reader tools and techniques to develop computer-vision-based products. There are plenty of practical examples covered in the book to follow the theory.

Who this book is for

This book is for readers who want to know how to apply deep learning to computer vision problems such as classification, detection, retrieval, segmentation, generation, captioning, and video classification. The reader also wants to understand how to achieve good accuracy under constraints such as limited data, imbalanced classes, and noise, and how to deploy trained models on various platforms (AWS, Google Cloud, Raspberry Pi, and mobile phones). After completing this book, the reader should be able to develop code for problems such as person detection, face recognition, product search, medical image segmentation, image generation, image captioning, and video classification.

What this book covers

Chapter 1, Getting Started, introduces the basics of deep learning and makes the readers familiar with the vocabulary. The readers will install the software packages necessary to follow the rest of the chapters. 

Chapter 2, Image Classification, talks about the image classification problem, which is labeling an image as a whole. The readers will learn about image classification techniques and train a deep learning model for pet classification. They will also learn methods to improve accuracy and dive deep into various advanced architectures.

Chapter 3, Image Retrieval, covers deep features and image retrieval. The reader will learn about model visualization, obtaining visual features, inference and serving with TensorFlow, and using visual features for product retrieval.

Chapter 4, Object Detection, talks about detecting objects in images. The reader will learn about various techniques of object detection and apply them for pedestrian detection. The TensorFlow API for object detection will be utilized in this chapter.

Chapter 5, Semantic Segmentation, covers pixel-wise segmentation of images. The readers will learn about segmentation techniques and train a model for segmentation of medical images.

Chapter 6, Similarity Learning, talks about similarity learning. The readers will learn about similarity matching and how to train models for face recognition. Training a model for facial landmarks is also illustrated.

Chapter 7, Image Captioning, is about generating or selecting captions for images. The readers will learn natural language processing techniques and how to generate captions for images using those techniques.

Chapter 8, Generative Models, talks about generating synthetic images for various purposes. The readers will learn what generative models are and use them for image generation applications, such as style transfer and generating training data.

Chapter 9, Video Classification, covers computer vision techniques for video data. The readers will understand the key differences between solving video versus image problems and implement video classification techniques.

Chapter 10, Deployment, talks about the deployment steps for deep learning models. The reader will learn how to deploy trained models and optimize for speed on various platforms.

To get the most out of this book

The examples covered in this book can be run on Windows, Ubuntu, or Mac. All the installation instructions are covered. Basic knowledge of Python and machine learning is required. GPU hardware is preferable but not necessary.

Download the example code files

You can download the example code files for this book from your account at www.packtpub.com. If you purchased this book elsewhere, you can visit www.packtpub.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

  1. Log in or register at www.packtpub.com.
  2. Select the SUPPORT tab.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows
  • Zipeg/iZip/UnRarX for Mac
  • 7-Zip/PeaZip for Linux
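
If you prefer not to install a separate tool, the extraction step can also be scripted. The following is a minimal sketch using only the Python standard library, assuming the downloaded bundle is a ZIP archive; the function name extract_bundle and the file and folder names in the usage note are hypothetical, not part of the book's code.

```python
import zipfile
from pathlib import Path


def extract_bundle(archive_path, dest_dir):
    """Extract a ZIP archive into dest_dir and return the sorted member names.

    extract_bundle is a hypothetical helper for illustration only.
    """
    archive_path = Path(archive_path)
    dest_dir = Path(dest_dir)
    # Create the destination folder if it does not exist yet.
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as bundle:
        bundle.extractall(dest_dir)
        return sorted(bundle.namelist())
```

For example, if the downloaded file were saved as code_bundle.zip (an assumed name), calling extract_bundle("code_bundle.zip", "code") would unpack it into a folder named code and return the list of extracted paths.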

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Deep-Learning-for-Computer-Vision. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

 

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Note that the graph is written once with the summary_writer."

A block of code is set as follows:

merged_summary_operation = tf.summary.merge_all()
train_summary_writer = tf.summary.FileWriter('/tmp/train', session.graph)
test_summary_writer = tf.summary.FileWriter('/tmp/test')

Any command-line input or output is written as follows:

wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Once you are done, terminate the instance by clicking Actions|Instance State|Terminate."

Note

Warnings or important notes appear like this.

Tip

Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: Email feedback@packtpub.com and mention the book title in the subject of your message. If you have questions about any aspect of this book, please email us at questions@packtpub.com.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packtpub.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions; we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packtpub.com.

 

 
