You're reading from Hands-On Neural Networks with TensorFlow 2.0

Product type: Book
Published in: Sep 2019
Reading level: Expert
Publisher: Packt
ISBN-13: 9781789615555
Edition: 1st Edition
Languages: Python
Tools: TensorFlow
Concepts: Neural Networks
Author: Paolo Galeone

Introduction to Object Detection

Detecting and classifying objects in images is a challenging problem. So far, we have treated image classification at a simple level; in a real-life scenario, we are unlikely to have pictures containing just one object. In industrial environments, it is possible to set up cameras and mechanical supports to capture images of single objects. However, even in constrained environments such as industrial ones, such a strict setup is not always possible. Smartphone applications, automated guided vehicles, and, more generally, any real-life application that uses images captured in a non-controlled environment require the simultaneous localization and classification of several objects in the input images. Object detection is the process of localizing an object in an image by predicting the coordinates of a bounding box that...

Getting the data

Object detection is a supervised learning problem that requires a considerable amount of data to reach good performance. Carefully annotating images by drawing bounding boxes around the objects and assigning them the correct labels is time-consuming and requires several hours of repetitive work.

Fortunately, several ready-to-use datasets for object detection already exist. The most famous is the ImageNet dataset, immediately followed by the PASCAL VOC 2007 dataset. Using ImageNet requires dedicated hardware, since its size and the number of labeled objects per image make the object detection task hard to tackle.

PASCAL VOC 2007, instead, consists of only 9,963 images in total, each of them with a different number of labeled objects belonging to the 20 selected object classes. The twenty object classes are as follows...
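As the chapter summary notes, TensorFlow Datasets can fetch PASCAL VOC 2007 in a few lines, and the exercises mention a filter that keeps only images containing a single object. The following is a minimal sketch of that idea, not the author's exact code. The function names `has_single_object` and `load_single_object_voc2007` are hypothetical; the dataset name `voc/2007` and the `objects`/`label` feature keys are taken from the TensorFlow Datasets catalog.

```python
import tensorflow as tf


def has_single_object(sample):
    """Return a boolean tensor: True when exactly one object is annotated."""
    # Each sample stores one label per annotated object, so the length of
    # the label tensor equals the number of objects in the image.
    return tf.equal(tf.shape(sample["objects"]["label"])[0], 1)


def load_single_object_voc2007(split="train"):
    """Load one PASCAL VOC 2007 split and keep only single-object images."""
    # Imported here so the predicate above can be used on its own,
    # without tensorflow_datasets being installed.
    import tensorflow_datasets as tfds

    # The dataset is downloaded the first time this call runs.
    dataset = tfds.load("voc/2007", split=split)
    return dataset.filter(has_single_object)
```

Calling `load_single_object_voc2007` with `"train"`, `"validation"`, and `"test"` would give three filtered datasets. The first call triggers the download, so it needs network access and disk space.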

Object localization

Convolutional neural networks (CNNs) are extremely flexible: so far, we have used them to solve classification problems, making them learn to extract features specific to the task. As shown in Chapter 6, Image Classification Using TensorFlow Hub, the standard architecture of a CNN designed to classify images is made of two parts: the feature extractor, which produces a feature vector, and a set of fully connected layers that classifies the feature vector into the (hopefully) correct class:

The classifier placed on top of the feature vector can also be seen as the head of the network

The fact that, so far, CNNs have only been used to solve classification problems should not mislead us. These types of networks are extremely powerful, and, especially in their multilayer setting, they can be used to solve many different kinds of problems, extracting...

Classification and localization

An architecture like the one defined so far that has no information about the class of the object it's localizing is called a region proposal.

It is possible to perform object detection and localization using a single neural network. In fact, nothing stops us from adding a second head on top of the feature extractor and training it to classify the image while, at the same time, training the regression head to regress the bounding box coordinates.
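A two-headed network of this kind can be sketched with the Keras functional API. This is an illustration under stated assumptions, not the chapter's model: the small convolutional feature extractor, the 64x64 input size, the layer names, and the choice of losses are all placeholders chosen to keep the example short.

```python
import tensorflow as tf

NUM_CLASSES = 20  # PASCAL VOC 2007 has 20 object classes


def build_multitask_model(input_shape=(64, 64, 3), num_classes=NUM_CLASSES):
    """Build a shared feature extractor with two heads on top."""
    inputs = tf.keras.Input(shape=input_shape)

    # Toy feature extractor (assumption): any network that outputs a
    # feature vector could take its place.
    x = tf.keras.layers.Conv2D(16, 3, strides=2, activation="relu")(inputs)
    x = tf.keras.layers.Conv2D(32, 3, strides=2, activation="relu")(x)
    features = tf.keras.layers.GlobalAveragePooling2D()(x)

    # Classification head: one logit per class.
    class_logits = tf.keras.layers.Dense(
        num_classes, name="classification_head"
    )(features)

    # Regression head: the four bounding box coordinates.
    box_coords = tf.keras.layers.Dense(4, name="regression_head")(features)

    return tf.keras.Model(inputs=inputs, outputs=[class_logits, box_coords])


model = build_multitask_model()

# One loss per head, in the same order as the model outputs; Keras sums
# them into a single training objective.
model.compile(
    optimizer="adam",
    loss=[
        tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        tf.keras.losses.MeanSquaredError(),
    ],
)
```

Both heads read the same feature vector, so the gradients of both losses update the shared feature extractor during training.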

Solving multiple tasks at the same time is the goal of multitask learning.

Multitask learning

Rich Caruana defines multitask learning in his paper Multitask Learning (1997):

"Multitask Learning is an approach to inductive transfer that...

Summary

In this chapter, the problem of object detection was introduced and some basic solutions were proposed. We first focused on the data required and used TensorFlow Datasets to get the PASCAL VOC 2007 dataset ready to use in a few lines of code. Then, we looked at the problem of using a neural network to regress the coordinates of a bounding box, showing how a convolutional neural network can easily be used to produce the four coordinates of a bounding box, starting from the image representation. In this way, we built a region proposal, that is, a network able to suggest where in the input image a single object can be detected, without producing any other information about the detected object.

After that, the concept of multitask learning was introduced, and we showed how to add a classification head next to the regression head by using the Keras functional API. Then, we covered...

Exercises

You can answer all the theoretical questions and, perhaps more importantly, struggle to solve all the code challenges that each exercise contains:

In the Getting the data section, a filtering function was applied to the PASCAL VOC 2007 dataset to select only the images with a single object inside. The filtering process, however, doesn't take class balance into account.
Create a function that, given the three filtered datasets, merges them first and then creates three balanced splits (with a tolerable class imbalance, if it is not possible to have them perfectly balanced).
Use the splits created in the previous point to retrain the network for localization and classification defined in the chapter. How and why does the performance change?
What does the Intersection over Union metric measure?

What does an IoU value of 0.4 represent? A good or a bad match?
What...


Author (1)

Paolo Galeone

Paolo Galeone is a computer engineer with strong practical experience. After getting his MSc degree, he joined the Computer Vision Laboratory at the University of Bologna, Italy, as a research fellow, where he improved his computer vision and machine learning knowledge working on a broad range of research topics. Currently, he leads the Computer Vision and Machine Learning laboratory at ZURU Tech, Italy. In 2019, Google recognized his expertise by awarding him the title of Google Developer Expert (GDE) in Machine Learning. As a GDE, he shares his passion for machine learning and the TensorFlow framework by blogging, speaking at conferences, contributing to open-source projects, and answering questions on Stack Overflow.