You're reading from Automated Machine Learning with Microsoft Azure

Product type: Book

Published in: Apr 2021

Publisher: Packt

ISBN-13: 9781800565319

Edition: 1st Edition

Tools

Azure Functions

Concepts

Machine Learning

Author (1)

Dennis Michael Sawyers

Chapter 5: Building an AutoML Classification Solution

After building your AutoML regression solution with Python in Chapter 4, Building an AutoML Regression Solution, you should be feeling confident in your coding abilities. In this chapter, you will build a classification solution. Unlike regression, classification is used to predict the category of the object of interest. For example, if you're trying to predict who is likely to become a homeowner in the next five years, classification is the right machine learning approach.

Binary classification is when you are trying to predict two classes, such as homeowner or not, while multiclass classification involves trying to predict three or more classes, such as homeowner, renter, or lives with family. You can utilize both of these techniques with Azure AutoML, and this chapter will teach you how to train both kinds of models using different datasets.
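As a minimal sketch of this distinction, the helper below counts the distinct values in a target column and reports which kind of problem they imply. The function name `classification_kind` is hypothetical and is not part of the Azure SDK; it only illustrates the two-versus-three-or-more rule described above.

```python
def classification_kind(labels):
    """Return 'binary' or 'multiclass' based on the number of distinct labels."""
    distinct = set(labels)
    if len(distinct) < 2:
        raise ValueError("A classification target needs at least two classes.")
    return "binary" if len(distinct) == 2 else "multiclass"


# Two classes: homeowner or not.
print(classification_kind(["homeowner", "not homeowner", "homeowner"]))

# Three classes: homeowner, renter, or lives with family.
print(classification_kind(["homeowner", "renter", "lives with family"]))
```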

In this chapter, you will begin by navigating directly to the Jupyter environment...

Technical requirements

For this chapter, you will be building models with Python code in Jupyter notebooks through Azure Machine Learning (AML) studio. Furthermore, you will be using datasets and Azure resources that you should have created in previous chapters. As such, the full list of requirements is as follows:

Access to the internet
A web browser, preferably Google Chrome or Microsoft Edge Chromium
A Microsoft Azure account
An Azure Machine Learning workspace
The titanic-compute-instance compute instance created in Chapter 2, Getting Started with Azure Machine Learning
The compute-cluster compute cluster created in Chapter 2, Getting Started with Azure Machine Learning
The Titanic Training Data dataset from Chapter 3, Training Your First AutoML Model
An understanding of how to navigate to the Jupyter environment from an Azure compute instance as demonstrated in Chapter 4, Building an AutoML Regression Solution

Prepping data for AutoML classification

Classification, or predicting the category of something based on its attributes, is one of the key techniques of machine learning. Just as with regression, you first need to prep your data before training a model on it with AutoML. In this section, you will navigate to your Jupyter notebook, load in your data, and transform it for use with AutoML.

Just as you loaded the Diabetes Sample dataset via a Jupyter notebook for regression, you will do the same with the Titanic Training Data dataset. This time, however, you will transform the data much more extensively before training your AutoML model. The extra work is there to build upon your learning; classification datasets do not inherently require more transformation than their regression counterparts. As in the previous chapter, you will begin by opening a Jupyter notebook from your compute instance.
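To give a feel for the kind of transformation involved, here is a minimal sketch of one common step: filling missing values in a numeric column with that column's median. It uses only the Python standard library so that it runs anywhere. The sample rows and the helper name `fill_missing_with_median` are assumptions made for illustration and are not the chapter's actual transformation code.

```python
from statistics import median


def fill_missing_with_median(rows, column):
    """Return copies of rows with None in `column` replaced by the column median."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill_value = median(observed)
    return [
        {**row, column: fill_value if row[column] is None else row[column]}
        for row in rows
    ]


# Hypothetical sample rows; the real Titanic Training Data has many more columns.
passengers = [
    {"Age": 22.0, "Survived": 0},
    {"Age": None, "Survived": 1},
    {"Age": 38.0, "Survived": 1},
    {"Age": 26.0, "Survived": 1},
]

cleaned = fill_missing_with_median(passengers, "Age")
print(cleaned[1]["Age"])
```

With pandas, the same idea is usually written as `df['Age'].fillna(df['Age'].median())`; the pure-Python version above simply makes each step explicit.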

Navigating to your Jupyter environment

Similar to Chapter 4, Building an AutoML Regression...

Training an AutoML classification model

Training an AutoML classification model is very similar to training an AutoML regression model, but there are a few key differences. In Chapter 4, Building an AutoML Regression Solution, you began by setting a name for your experiment. After that, you set your target column and subsequently set your AutoML configurations. Finally, you used AutoML to train a model, performed a data guardrails check, and produced results.

All of the steps in this section are nearly the same. However, pay close attention to the data guardrails check and results, as they are substantially different when training classification models:

Set your experiment and give it a name:

experiment_name = 'Titanic-Transformed-Classification'
exp = Experiment(workspace=ws, name=experiment_name)

Set your dataset to your transformed Titanic data:

dataset_name = "Titanic Transformed"
dataset = Dataset.get_by_name(ws, dataset_name, version='...
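The remaining steps follow the pattern from the regression chapter: choose a target column, collect your AutoML settings, and submit the run. The sketch below shows one way this could look with the Azure ML Python SDK (v1). The helper names, the `Survived` target column, the `accuracy` primary metric, and the timeout and cross-validation values are assumptions for illustration, not the chapter's exact settings. The SDK import is placed inside the submit function so that the settings helper can be run and inspected without the SDK installed.

```python
def build_classification_settings(target_column, timeout_hours=0.25, n_folds=5):
    """Collect keyword arguments for an AutoML classification run."""
    return {
        "task": "classification",
        "primary_metric": "accuracy",
        "label_column_name": target_column,
        "n_cross_validations": n_folds,
        "experiment_timeout_hours": timeout_hours,
        "enable_early_stopping": True,
        "featurization": "auto",
    }


def submit_automl_run(exp, dataset, compute_target, settings):
    """Submit the run; needs the Azure ML SDK v1, so it is imported lazily."""
    from azureml.train.automl import AutoMLConfig

    config = AutoMLConfig(
        training_data=dataset, compute_target=compute_target, **settings
    )
    return exp.submit(config, show_output=True)


# 'Survived' is assumed to be the target column of the Titanic data.
settings = build_classification_settings("Survived")
print(settings["task"], settings["label_column_name"])
```

In a notebook, you would then call `submit_automl_run(exp, dataset, compute_target, settings)` with the experiment, dataset, and compute cluster objects you created earlier.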

Registering your trained classification model

The code to register classification models is identical to the code you used in Chapter 4, Building an AutoML Regression Solution, to register your regression model. Always register new models, as you will use them to score new data using either real-time scoring endpoints or batch execution inference pipelines depending on your use case. This will be explained in Chapter 9, Implementing a Batch Scoring Solution, and Chapter 11, Implementing a Real-Time Scoring Solution. Likewise, when registering your models, always add tags and descriptions for easier tracking:

First, give your model a name, a description, and some tags:

description = 'Best AutoML Classification Run using Transformed Titanic Data.'
tags = {'project': 'Titanic', 'creator': 'your name'}
model_name = 'Titanic-Transformed-Classification-AutoML'

Tags let you easily search for models, so think carefully...
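A minimal sketch of the registration step follows. It assumes you have a completed AutoML run object (called `automl_run` here, a hypothetical name) that exposes the SDK's `register_model` method. The `validate_tags` helper is also hypothetical; it guards against non-string tags, since Azure ML expects tag keys and values to be strings.

```python
def validate_tags(tags):
    """Ensure every tag key and value is a non-empty string before registering."""
    for key, value in tags.items():
        if not (isinstance(key, str) and isinstance(value, str)):
            raise TypeError(f"Tag {key!r} must map a string to a string.")
        if not key.strip() or not value.strip():
            raise ValueError(f"Tag {key!r} has an empty key or value.")
    return tags


def register_best_model(automl_run, model_name, description, tags):
    """Register the best model of a completed AutoML run with validated metadata."""
    return automl_run.register_model(
        model_name=model_name,
        description=description,
        tags=validate_tags(tags),
    )


description = 'Best AutoML Classification Run using Transformed Titanic Data.'
tags = {'project': 'Titanic', 'creator': 'your name'}
model_name = 'Titanic-Transformed-Classification-AutoML'

# Check the metadata locally before calling register_best_model in a notebook.
print(validate_tags(tags))
```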

Training an AutoML multiclass model

Multiclass classification involves predicting three or more classes rather than the two classes of binary classification. With custom machine learning, training multiclass models is often a messy, complicated affair in which you have to carefully consider the number of classes you are trying to predict, how unbalanced those classes are relative to each other, whether you should combine classes, and how you should present your results. Luckily, AutoML takes care of all these considerations for you and makes training a multiclass model as simple as training a binary classification model.
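To see what "how unbalanced those classes are" means in practice, the sketch below computes each class's share of the rows and the ratio between the largest and smallest class. The helper names and the sample labels are hypothetical; this is a way to inspect your own data, not a description of what AutoML does internally.

```python
from collections import Counter


def class_distribution(labels):
    """Return each class's share of the rows, largest class first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.most_common()}


def imbalance_ratio(labels):
    """Return the size of the largest class divided by the size of the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical target column with three classes of very different sizes.
labels = ["homeowner"] * 6 + ["renter"] * 3 + ["lives with family"] * 1

print(class_distribution(labels))
print(imbalance_ratio(labels))
```

A ratio close to 1 means the classes are roughly balanced; a large ratio signals that the smallest class may need special attention when you read the results.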

In this section, you will load in data using the publicly available Iris dataset. You will then set your AutoML configurations for multiclass classification, train and register a model, and examine your results. You will notice that much of the code is identical to the last section. By understanding the differences between binary and multiclass classification in AutoML...

Fine-tuning your AutoML classification model

In this section, you will first review tips and tricks for improving your AutoML classification models and then review the algorithms used by AutoML for both binary and multiclass classification.

Improving AutoML classification models

Keeping in mind the tips and tricks from Chapter 4, Building an AutoML Regression Solution, here are new ones that are specific to classification:

Unlike regression problems, nearly all real-world classification problems require you to weight your target column. The reason is that, for most business problems, one class is nearly always more important than the others.
For example, imagine you are running a business and you are trying to predict which customers will stop doing business with you and leave you for a competitor. This is a common problem called customer churn or customer turnover. If you misidentify a customer as being likely to churn, all you waste is an unnecessary phone call...
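One way to express that one class matters more than another is to give each row a weight based on its class. The sketch below builds such a list of weights. The function name `row_weights` and the weight of 5.0 for churners are hypothetical values chosen for illustration. In the Azure ML SDK (v1), a column of weights like this can be referenced through the `weight_column_name` setting of `AutoMLConfig`.

```python
def row_weights(labels, class_weight, default=1.0):
    """Return one weight per row, looked up from the row's class label."""
    return [class_weight.get(label, default) for label in labels]


# Hypothetical churn labels; missing a churner is treated as five times as costly.
labels = ["stay", "churn", "stay", "stay", "churn"]
weights = row_weights(labels, {"churn": 5.0})

print(weights)
```

You would then add `weights` to your data as a new column before registering the dataset, so that each row carries its own importance.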

Summary

You have added to your repertoire by successfully training a classification model using the AML Python SDK. You have loaded in data, heavily transformed it using pandas and NumPy, and built a toy AutoML model. You then registered that model to your AMLS workspace.

You can now start building classification models with your own data. You can easily solve both binary and multiclass classification problems, and you can present results to the business in a way they understand with confusion matrices. Many of the most common business problems, such as customer churn, are classification problems, and with the knowledge you learned in this chapter, you can solve those problems and earn trust and respect in your organization.
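As a reminder of what a confusion matrix contains, here is a minimal sketch that builds one from lists of actual and predicted labels using only the standard library. The function name and the sample labels are hypothetical; AutoML produces its own confusion matrix for you, so this is only meant to show how the counts are arranged.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Build a matrix whose rows are actual classes and columns are predicted classes."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length.")
    pair_counts = Counter(zip(actual, predicted))
    return [[pair_counts[(a, p)] for p in classes] for a in classes]


# Hypothetical results for six customers.
actual = ["churn", "stay", "stay", "churn", "stay", "stay"]
predicted = ["churn", "stay", "churn", "stay", "stay", "stay"]
classes = ["churn", "stay"]

matrix = confusion_matrix(actual, predicted, classes)
for label, row in zip(classes, matrix):
    print(label, row)
```

The diagonal holds the correct predictions, and every off-diagonal cell counts one specific kind of mistake, which is what makes the matrix easy to explain to a business audience.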

The next chapter, Chapter 6, Building an AutoML Forecasting Solution, will be vastly different from the previous two chapters. Forecasting problems have many more settings to use and understand compared to classification and regression problems, and they...


Author (1)

Dennis Michael Sawyers

Dennis Michael Sawyers is a senior cloud solutions architect (CSA) at Microsoft, specializing in data and AI. In his role as a CSA, he helps Fortune 500 companies leverage Microsoft Azure cloud technology to build top-class machine learning and AI solutions. Prior to his role at Microsoft, he was a data scientist at Ford Motor Company in Global Data Insight and Analytics (GDIA) and a researcher in anomaly detection at the highly regarded Carnegie Mellon Auton Lab. He received a master's degree in data analytics from Carnegie Mellon's Heinz College and a bachelor's degree from the University of Michigan. More than anything, Dennis is passionate about democratizing AI solutions through automated machine learning technology.