You're reading from Deep Learning with MXNet Cookbook

Product type: Book
Published in: Dec 2023
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800569607
Edition: 1st Edition
Languages: Python
Tools: MXNet
Concepts: Machine Learning

Author (1)

Andrés P. Torres

Analyzing Images with Computer Vision

Computer vision is one of the fields in which deep learning has progressed the most, surpassing human-level performance in several tasks such as image classification and object recognition. Furthermore, the field has moved from academia to real-world applications, and industry recognizes that its practitioners add high value to businesses.

In this chapter, we will learn how to use GluonCV, an MXNet Gluon library for computer vision, how to build our own networks, and how to apply pretrained models from GluonCV’s model zoo to several applications.

Specifically, we will cover the following topics:

Understanding convolutional neural networks
Classifying images with AlexNet and ResNet
Detecting objects with Faster R-CNN and YOLO
Segmenting objects in images with PSPNet and DeepLab-v3

Technical requirements

In addition to the technical requirements specified in the Preface, the following apply to this chapter:

Ensure that you have completed Installing MXNet, Gluon, GluonCV and GluonNLP, the first recipe from Chapter 1, Up and Running with MXNet
Ensure that you have completed A toy dataset for regression – load, manage, and visualize a house sales dataset, the first recipe from Chapter 2, Working with MXNet and Visualizing Datasets: Gluon and DataLoader

The code for this chapter can be found at the following GitHub URL: https://github.com/PacktPublishing/Deep-Learning-with-MXNet-Cookbook/tree/main/ch05.

Furthermore, you can access each recipe directly from Google Colab – for example, for the first recipe of this chapter: https://colab.research.google.com/github/PacktPublishing/Deep-Learning-with-MXNet-Cookbook/blob/main/ch05/5_1_Understanding_Convolutional_Neural_Networks.ipynb.

Understanding convolutional neural networks

In the previous chapters, we used fully connected Multi-Layer Perceptron (MLP) networks to solve our regression and classification problems. However, as we will see, these networks are not optimal for solving image-related problems.

Images are high-dimensional entities – for example, each pixel in a color image has three features (red, green, and blue values), and a 1,024x1,024 image has more than 1 million pixels (a 1 megapixel image) and, therefore, more than 3 million features (3 × 10^6). If we connect all these inputs to a second layer of 100 neurons in a fully connected network, we will require more than 10^8 parameters (about 3 × 10^8), and that would be only for the first layer. Processing images with such networks is, therefore, expensive in both memory and time.
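The arithmetic in the paragraph above can be checked with a few lines of plain Python. This is only a sketch: the variable names are illustrative, and the convolutional layer used for contrast (100 filters of size 3x3) is an assumption added here, not a figure from the text.

```python
# Parameter count for the first fully connected layer of the example above.
height, width, channels = 1024, 1024, 3   # 1 megapixel RGB image
hidden_units = 100                        # neurons in the second layer

input_features = height * width * channels       # one feature per pixel channel
dense_weights = input_features * hidden_units    # one weight per input/neuron pair
dense_params = dense_weights + hidden_units      # plus one bias per neuron

# For contrast (assumption, not from the text): a convolutional layer with
# 100 filters of size 3x3 applied over the same 3 input channels.
kernel = 3
conv_params = kernel * kernel * channels * hidden_units + hidden_units

print(f"input features:   {input_features:,}")
print(f"dense parameters: {dense_params:,}")
print(f"conv parameters:  {conv_params:,}")
```

The convolutional count does not depend on the image size, because the same small filters are reused at every position in the image.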

Furthermore, imagine that we are trying to detect eyes in faces; if a pixel belongs to an eye, the likelihood of nearby pixels belonging to the eye is very high (think of the...

Classifying images with MXNet – GluonCV Model Zoo, AlexNet, and ResNet

MXNet provides a variety of tools to compose custom deep learning models. In this recipe, we will see how to use MXNet to build a model from scratch, train it, and use it to classify images from a dataset. We will also see that although this approach works fine, it is time-consuming.

Another option, and one of the highest-value features that MXNet and GluonCV provide, is the Model Zoo. GluonCV Model Zoo is a set of pre-trained, ready-to-use models that you can plug into your own applications. We will see how to use Model Zoo with two very important models for image classification – AlexNet and ResNet.

In this recipe, we will analyze and compare these approaches to classify images on a reduced version of the Dogs vs. Cats dataset.
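As a rough sketch of the Model Zoo approach, the snippet below wraps GluonCV's `model_zoo.get_model` call and maps an ImageNet prediction to a dog/cat label. Several things here are assumptions rather than the recipe's own code: the helper names are hypothetical, the model variants (`alexnet`, `resnet50_v1`) are only examples of Model Zoo entries, and the index ranges rely on the commonly used ImageNet-1k class ordering, in which indices 151 to 268 are dog breeds and 281 to 285 are domestic cats.

```python
# Example Model Zoo entries (assumption: the recipe may use other variants).
CLASSIFIER_NAMES = ("alexnet", "resnet50_v1")

# Assumed ImageNet-1k class ordering: dog breeds and domestic cats.
DOG_CLASSES = range(151, 269)
CAT_CLASSES = range(281, 286)


def load_classifier(name):
    """Fetch a pre-trained ImageNet classifier (downloads weights on first call)."""
    from gluoncv import model_zoo  # lazy import: needs mxnet and gluoncv installed
    return model_zoo.get_model(name, pretrained=True)


def argmax(scores):
    """Index of the highest score in a flat sequence."""
    return max(range(len(scores)), key=scores.__getitem__)


def to_dog_or_cat(imagenet_index):
    """Collapse a 1,000-class ImageNet prediction into dog / cat / other."""
    if imagenet_index in DOG_CLASSES:
        return "dog"
    if imagenet_index in CAT_CLASSES:
        return "cat"
    return "other"


# Toy scores standing in for a network output (made-up values).
scores = [0.0] * 1000
scores[207] = 0.9   # a dog breed under the assumed ordering
scores[281] = 0.1   # a cat class under the assumed ordering
print(to_dog_or_cat(argmax(scores)))
```

With MXNet and GluonCV installed, `load_classifier("resnet50_v1")` would return a network whose 1,000-way output can be reduced in the same way.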

Getting ready

As with previous chapters, in this recipe, we will use a few matrix operations and linear algebra, but it will not be too difficult.

Furthermore, we will...

Detecting objects with MXNet – Faster R-CNN and YOLO

In this recipe, we will see how to use MXNet and GluonCV with a pre-trained model to detect objects in the images of a dataset. We will see how to use GluonCV Model Zoo with two very important models for object detection – Faster R-CNN and YOLOv3.

We will also compare the performance of these two pre-trained models when detecting objects on the Penn-Fudan Pedestrians dataset.
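To make the comparison concrete, here is a minimal sketch of how detections could be filtered and scored against a ground-truth box. The two model names are GluonCV Model Zoo entries, but whether the recipe uses these exact variants is an assumption. The helper names, the 0.5 score threshold, the choice of intersection over union (IoU) as the measure, and the toy detections are all illustrative.

```python
# Model Zoo entries for the two detectors (assumed variants, trained on COCO).
DETECTOR_NAMES = ("faster_rcnn_resnet50_v1b_coco", "yolo3_darknet53_coco")


def load_detector(name):
    """Fetch a pre-trained detector (downloads weights on first call)."""
    from gluoncv import model_zoo  # lazy import: needs mxnet and gluoncv installed
    return model_zoo.get_model(name, pretrained=True)


def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    overlap_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    overlap_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = overlap_w * overlap_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def keep_confident(detections, wanted_label="person", min_score=0.5):
    """Keep (label, score, box) detections of one class above a score threshold."""
    return [d for d in detections if d[0] == wanted_label and d[1] >= min_score]


# Toy detections standing in for a detector's output (made-up values).
detections = [
    ("person", 0.92, (10, 20, 60, 180)),
    ("person", 0.30, (200, 40, 240, 160)),
    ("bicycle", 0.85, (50, 100, 150, 190)),
]
ground_truth = (12, 25, 58, 175)  # hypothetical annotated pedestrian box

pedestrians = keep_confident(detections)
for label, score, box in pedestrians:
    print(label, score, round(iou(box, ground_truth), 3))
```

Each detection pairs a class label and score (the classification part) with box coordinates (the regression part), so both can be checked separately.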

Getting ready

As with previous chapters, in this recipe, we will use a few matrix operations and linear algebra, but it will not be too difficult.

As we will unpack in this recipe, object detection combines classification and regression; therefore, we recommend revisiting the chapters and recipes where we explored the foundations of these topics. Furthermore, we will be detecting objects in image datasets. This recipe will combine what we learned in the following chapters:

Understanding image datasets: load, manage, and visualize...

Segmenting objects in images with MXNet – PSPNet and DeepLab-v3

In this recipe, we will see how to use MXNet and GluonCV with a pre-trained model to segment objects in the images of a dataset. This means that we will be able to assign the pixels of an image to different classes, such as person, cat, and dog. When framing the problem as segmentation, the expected output is an image of the same size as the input image, in which each pixel value is the predicted class label (we will analyze how this works in the following sections). We will see how to use GluonCV Model Zoo with two very important models for semantic segmentation – PSPNet and DeepLab-v3.
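The idea of an output "image" of class labels can be sketched in plain Python: given one score map per class, the label of each pixel is the index of the class with the highest score at that position. The function name, the three class names, and the toy scores below are all hypothetical; a real network would produce many more classes and much larger maps.

```python
# Hypothetical label set for this toy example.
CLASS_NAMES = ("background", "person", "cat")


def scores_to_mask(scores):
    """Turn scores[class][row][col] into mask[row][col] of winning class indices."""
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][row][col])
         for col in range(width)]
        for row in range(height)
    ]


# Toy per-class score maps for a 2x3 image (made-up values).
scores = [
    [[0.90, 0.20, 0.10], [0.80, 0.10, 0.30]],   # background
    [[0.05, 0.70, 0.80], [0.10, 0.80, 0.10]],   # person
    [[0.05, 0.10, 0.10], [0.10, 0.10, 0.60]],   # cat
]

mask = scores_to_mask(scores)
for row in mask:
    print(" ".join(CLASS_NAMES[label] for label in row))
```

The mask has the same height and width as the input score maps, which matches the description of the segmentation output above.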

We will also compare the performance of these two pre-trained models at semantically segmenting objects on the dataset introduced in the previous chapter, Penn-Fudan Pedestrians, as its ground truth also includes segmentation masks.

Getting ready

As with previous chapters, in this recipe, we will use a few matrix operations and...

Andrés P. Torres is the Head of Perception at Oxa, a global leader in industrial autonomous vehicles, where he leads the design and development of state-of-the-art algorithms for autonomous driving. Previously, Andrés was an advisor and Head of AI at an early-stage content generation startup, Maekersuite, where he developed several AI-based algorithms for mobile phones and the web. Prior to this, Andrés was a Software Development Manager at Amazon Prime Air, developing software to optimize operations for autonomous drones.