You're reading from Deep Learning with MXNet Cookbook

Product type: Book

Published in: Dec 2023

Reading level: Beginner

Publisher: Packt

ISBN-13: 9781800569607

Edition: 1st Edition

Languages: Python

Tools: MXNet

Concepts: Machine Learning

Author (1): Andrés P. Torres

Working with MXNet and Visualizing Datasets – Gluon and DataLoader

In the previous chapter, we learned how to set up MXNet. We also verified how MXNet could leverage our hardware to provide maximum performance. Before applying deep learning (DL) to solve specific problems, we need to understand how to load, manage, and visualize the datasets we will be working with. In this chapter, we will start using MXNet to analyze some toy datasets in the domains of numerical regression, data classification, image classification, and text classification. To manage those tasks efficiently, we will see new MXNet libraries and functions such as Gluon (an API for DL) and DataLoader.

In this chapter, we will cover the following topics:

Understanding regression datasets – loading, managing, and visualizing the House Sales dataset
Understanding classification datasets – loading, managing, and visualizing the Iris dataset
Understanding image datasets – loading...

Technical requirements

Apart from the technical requirements specified in the Preface, no other requirements apply to this chapter.

The code for this chapter can be found at the following GitHub URL: https://github.com/PacktPublishing/Deep-Learning-with-MXNet-Cookbook/tree/main/ch02

Furthermore, you can directly access each recipe from Google Colab; for example, for the first recipe of this chapter, visit https://colab.research.google.com/github/PacktPublishing/Deep-Learning-with-MXNet-Cookbook/blob/main/ch02/2_1_Toy_Dataset_for_Regression_Load_Manage_and_Visualize_House_Sales_Dataset.ipynb.

Understanding regression datasets – loading, managing, and visualizing the House Sales dataset

The training process of machine learning (ML) models can be divided into three main sub-groups:

Supervised learning (SL): The expected outputs are known for at least some data
Unsupervised learning (UL): The expected outputs are not known but the data has some features that could help with understanding its internal distribution
Reinforcement learning (RL): An agent explores the environment and makes decisions based on the inputs acquired from the environment

There is also an approach that falls between the first two sub-groups, called weakly supervised learning, where the known outputs are not sufficient to follow a fully SL approach for one of the following reasons:

The outputs are inaccurate
Only some of the output features are known (incomplete)
They are not exactly the expected outputs but are connected/related to the task we intend to achieve (inexact)

...

Understanding classification datasets – loading, managing, and visualizing the Iris dataset

In the previous recipe, we studied one of the most common problem types in SL: regression. In this recipe, we will take a closer look at another of these problem types: classification.

In classification problems, we want to estimate a categorical output, a class, from a set of given classes, using a variable number of input features. In this recipe, we will analyze a toy classification dataset from Kaggle: the Iris dataset, one of the most renowned classification datasets.

The Iris dataset poses the problem of predicting the class of an iris flower, out of three classes (iris setosa, iris versicolor, and iris virginica), with the help of the following four features:

Sepal length (in cm)
Sepal width (in cm)
Petal length (in cm)
Petal width (in cm)

These data features are provided for 150 flowers, with 50 instances for each of the 3 classes (making...
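The class balance described above can be checked in a few lines. The following is a minimal sketch in plain Python, not the recipe's own code: the label list is built synthetically (50 entries per class) instead of being loaded from the Kaggle file, and the names `CLASS_NAMES`, `encode_labels`, and `batches` are hypothetical. The `batches` helper is only a stand-in for the kind of mini-batching a data loader performs; it is not Gluon's DataLoader.

```python
from collections import Counter

# The three Iris classes named in the text, in a fixed order.
CLASS_NAMES = ["iris setosa", "iris versicolor", "iris virginica"]


def encode_labels(names):
    """Map each class name to its integer index in CLASS_NAMES."""
    index = {name: i for i, name in enumerate(CLASS_NAMES)}
    return [index[name] for name in names]


def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Synthetic stand-in for the label column: 50 flowers per class.
names = [name for name in CLASS_NAMES for _ in range(50)]
labels = encode_labels(names)

class_counts = Counter(labels)
batch_sizes = [len(b) for b in batches(labels, 32)]

print(len(labels))          # total number of flowers
print(dict(class_counts))   # flowers per encoded class
print(batch_sizes)          # sizes of the mini-batches
```

With 150 labels and a batch size of 32, the last batch is smaller than the others, which is the usual behaviour to expect when the dataset size is not a multiple of the batch size.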

Understanding image datasets – loading, managing, and visualizing the Fashion-MNIST dataset

One of the DL fields that has grown considerably in recent years is computer vision (CV). Since the AlexNet revolution in 2012, CV has expanded from lab research to surpassing human performance in real-world datasets (known as “in the wild”).

In this recipe, we will explore the simplest CV task: image classification. Given a set of images, our task is to correctly classify each image among a given set of labels (classes).

One of the most classic image classification datasets is the MNIST database (MNIST stands for Modified National Institute of Standards and Technology). Similarly sized, but better suited to current CV analysis, is the Fashion-MNIST dataset. This dataset is a multi-class image classification dataset, with a training set of 60k examples and a test set of 10k examples, with each example belonging to 1 of these 10 categories (starting with...
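Because each example belongs to exactly one of 10 categories, its label can be stored as a single integer or, equivalently, as a one-hot vector. The following is a minimal sketch in plain Python under stated assumptions: the helper names `one_hot` and `flatten` are hypothetical, the 28x28 grayscale image size is background knowledge about Fashion-MNIST and not something stated in this text, and the all-zero image is a placeholder, not real data.

```python
NUM_CLASSES = 10          # categories stated in the text
TRAIN_SIZE = 60_000       # training examples stated in the text
TEST_SIZE = 10_000        # test examples stated in the text
IMAGE_SHAPE = (28, 28)    # assumption: 28x28 grayscale images


def one_hot(label, num_classes=NUM_CLASSES):
    """Return a list with a single 1 at position `label`."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} is outside [0, {num_classes})")
    return [1 if i == label else 0 for i in range(num_classes)]


def flatten(image):
    """Turn a 2-D image (list of rows) into a flat list of pixels."""
    return [pixel for row in image for pixel in row]


# Placeholder all-black image; real pixel values come from the dataset.
blank_image = [[0] * IMAGE_SHAPE[1] for _ in range(IMAGE_SHAPE[0])]

encoded = one_hot(3)
pixels = flatten(blank_image)
train_fraction = TRAIN_SIZE / (TRAIN_SIZE + TEST_SIZE)

print(encoded)                    # one-hot vector for class index 3
print(len(pixels))                # number of pixels per image
print(round(train_fraction, 3))   # share of examples used for training
```

A one-hot vector always sums to 1, which is what distinguishes this setting from a multi-label one, where several positions could be set at once.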

Understanding text datasets – loading, managing, and visualizing the Enron Email dataset

Another field that has grown considerably in DL in recent years is natural language processing (NLP). Similarly to CV, this field aims to surpass human performance in real-world datasets.

In this recipe, we will explore one of the simplest NLP tasks: text classification. Given a set of sentences and paragraphs, our task is to correctly classify each text among a given set of labels (classes).

One of the most classic text classification tasks is to distinguish whether a received email is spam or not (ham). The datasets for this task are binary text classification datasets (only two labels to assign, 0 and 1, or ham and spam).
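The binary labelling described above can be sketched as follows. This is plain Python with two made-up messages used purely for illustration (they are not taken from any real email dataset); `LABELS`, `tokenize`, and `word_counts` are hypothetical names, and the regex-based lowercase tokenizer is a deliberate simplification of what a real NLP pipeline would do.

```python
import re
from collections import Counter

# Binary label convention from the text: ham -> 0, spam -> 1.
LABELS = {"ham": 0, "spam": 1}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(texts):
    """Count how often each token appears across an iterable of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


# Made-up examples for illustration only.
emails = [
    ("Meeting moved to Monday, see agenda attached", "ham"),
    ("Win a free prize now, click now", "spam"),
]

encoded = [LABELS[label] for _, label in emails]
spam_counts = word_counts(text for text, label in emails if label == "spam")

print(encoded)                     # integer labels, one per email
print(spam_counts.most_common(1))  # most frequent token in the spam texts
```

Per-class word counts of this kind are one simple way to get a first look at a text dataset before any model is trained.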

In our specific scenario, we will use a real-world email dataset. This set of emails was made public by the US Government during the investigation of the Enron scandal in the early 2000s. This dataset was first published in 2004 and is composed of emails from ~150 users,...


Author (1)

Andrés P. Torres

Andrés P. Torres is the Head of Perception at Oxa, a global leader in industrial autonomous vehicles, where he leads the design and development of state-of-the-art algorithms for autonomous driving. Previously, Andrés was an advisor and Head of AI at an early-stage content generation startup, Maekersuite, where he developed several AI-based algorithms for mobile phones and the web. Prior to this, Andrés was a Software Development Manager at Amazon Prime Air, developing software to optimize operations for autonomous drones.