Azure Data Scientist Associate Certification Guide

Product type Book

Published in Dec 2021

Publisher Packt

ISBN-13 9781800565005

Pages 448 pages

Edition 1st Edition

Languages

Python

Concepts

Machine Learning

Authors (2):

Andreas Botsikas

Michael Hlobil

Table of Contents (17) Chapters

Preface

Section 1: Starting your cloud-based data science journey

Chapter 1: An Overview of Modern Data Science

Chapter 2: Deploying Azure Machine Learning Workspace Resources

Chapter 3: Azure Machine Learning Studio Components

Chapter 4: Configuring the Workspace

Section 2: No code data science experimentation

Chapter 5: Letting the Machines Do the Model Training

Chapter 6: Visual Model Training and Publishing

Section 3: Advanced data science tooling and capabilities

Chapter 7: The AzureML Python SDK

Chapter 8: Experimenting with Python Code

Chapter 9: Optimizing the ML Model

Chapter 10: Understanding Model Results

Chapter 11: Working with Pipelines

Chapter 12: Operationalizing Models with Code

Other Books You May Enjoy

Chapter 5: Letting the Machines Do the Model Training

In this chapter, you will create your first Automated Machine Learning (Automated ML or AutoML) experiment. AutoML refers to the process of trying multiple modeling techniques and selecting the model that produces the best predictions against the training dataset you specify. First, you will navigate through the AutoML wizard that is part of the Azure Machine Learning Studio web experience and understand the different options that need to be configured. You will then learn how to monitor the progress of an AutoML experiment and how to deploy the best model it produces as a web service hosted in an Azure Container Instance (ACI), so that you can make real-time inferences.

The best way to go through this chapter is to sit in front of a computer with this book at hand. By using your Azure subscription and this book together, you can start your journey through AutoML.

In this chapter, we're going to cover the following main...

Technical requirements

You will need access to an Azure subscription. Within that subscription, you will need a resource group named packt-azureml-rg. You will need either the Contributor or the Owner role, assigned through Access control (IAM), at the resource group level. Within that resource group, you should then deploy a machine learning resource named packt-learning-mlw, as described in Chapter 2, Deploying Azure Machine Learning Workspace Resources.

Configuring an AutoML experiment

If you were asked to train a model to make predictions against a dataset, you would need to do several things: normalize the dataset, split it into training and validation data, run multiple experiments to understand which algorithm performs best against the dataset, and then fine-tune the best model. Automated machine learning shortens this process by automating these time-consuming, iterative tasks. It allows all users, from people with no data science background to experienced data scientists, to build multiple machine learning models against a target dataset and select the model that performs best, based on a metric you select.

This process consists of the following steps:

Preparing the experiment: Select the dataset you are going to use for training, select the column that you are trying to predict, and configure the experiment's parameters. This is the configuration phase you will read about in this section.
Data...
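The manual workflow that AutoML automates (split, normalize, try several algorithms, keep the best one by a chosen metric) can be sketched in plain Python. This is only a toy illustration under stated assumptions, not how Azure Machine Learning implements AutoML: the two "algorithms", the helper names, and the tiny churn-like dataset with its `dropped_calls` and `support_tickets` features are all hypothetical.

```python
import random

def train_validation_split(rows, labels, validation_fraction=0.25, seed=0):
    """Shuffle the row indices and hold out a fraction for validation."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - validation_fraction))

    def pick(selected):
        return [rows[i] for i in selected], [labels[i] for i in selected]

    return pick(indices[:cut]), pick(indices[cut:])

def fit_scaler(rows):
    """Min-max normalization learned from the training rows only."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    spans = [(max(column) - min(column)) or 1.0 for column in columns]
    return lambda row: [(v - lo) / span for v, lo, span in zip(row, lows, spans)]

def fit_majority_class(rows, labels):
    """Baseline candidate: always predict the most frequent training label."""
    majority = max(set(labels), key=labels.count)
    return lambda row: majority

def fit_nearest_centroid(rows, labels):
    """Candidate: predict the label whose mean feature vector is closest."""
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [sum(column) / len(members) for column in zip(*members)]

    def predict(row):
        return min(
            centroids,
            key=lambda label: sum((a - b) ** 2 for a, b in zip(row, centroids[label])),
        )

    return predict

def accuracy(model, rows, labels):
    """The metric used to rank the candidates."""
    return sum(model(row) == y for row, y in zip(rows, labels)) / len(labels)

def run_toy_automl(rows, labels, candidates, metric=accuracy):
    """Train every candidate and return the best name plus all scores."""
    (train_x, train_y), (valid_x, valid_y) = train_validation_split(rows, labels)
    scale = fit_scaler(train_x)
    train_x = [scale(row) for row in train_x]
    valid_x = [scale(row) for row in valid_x]
    scores = {}
    for name, fit in candidates.items():
        model = fit(train_x, train_y)
        scores[name] = metric(model, valid_x, valid_y)
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical data: [dropped_calls, support_tickets] per customer, 1 = churned.
rows = [[i % 5, i % 3] for i in range(20)] + [[10 + i % 5, 6 + i % 3] for i in range(20)]
labels = [0] * 20 + [1] * 20
candidates = {
    "majority_class": fit_majority_class,
    "nearest_centroid": fit_nearest_centroid,
}

best, scores = run_toy_automl(rows, labels, candidates)
print(best, scores)
```

On this deliberately easy dataset, the nearest-centroid candidate separates the two groups and is selected over the baseline. A real AutoML run explores far more algorithms and settings, but the selection step follows the same idea: score every candidate on held-out data and keep the one with the best metric.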

Monitoring the execution of the experiment

In the previous section, Configuring an AutoML experiment, you submitted an AutoML experiment to execute on a remote compute cluster. Once you have submitted the job, your browser should redirect you to a page similar to the following:

Figure 5.15 – Running a new Automated ML run for the first time since the run finished

At the top of the page, the name of the run of the experiment is autogenerated. In the preceding screenshot, it is called AutoML_05558d1d-c8ab-48a5-b652-4d47dc102d29. By clicking the pencil icon, you can edit this name and make it something more memorable. Change the name to my-first-experiment-run. Right below the run's name, you can click on one of the following commands:

Refresh: This will refresh the information provided on the page. While an experiment is running, use it to see the latest information.
Generate notebook: This will create a notebook with all...

Deploying the best model as a web service

In the previous section, you navigated around the experiment run page while reviewing the information related to the run execution and the results of the exploration, which are the trained models. In this section, we will revisit the Models tab and start deploying the best model as a web service to be able to make real-time inferences. Navigate to the run's details page, as shown in Figure 5.15. Let's get started:

Click on the Models tab. You should see a page similar to the one shown here:
Figure 5.16 – The Models tab as a starting point for deploying a model
In this list, you can select any model you want to deploy. Select the row with the best model, as shown in the preceding screenshot. Click the Deploy command at the top of the list. The Deploy a model dialog will appear, as shown here:
Figure 5.17 – The Deploy a model dialog
In the Deploy a model dialog, you will be able to define a deployment...
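Once a model is deployed as a web service, real-time inference amounts to sending an HTTP POST request with JSON input to the endpoint's scoring URI. The following is a minimal sketch using only the Python standard library. The URI, the column names, and the `{"data": [...]}` payload shape are assumptions for illustration; the exact input schema and authentication settings depend on your deployment, so check the endpoint's details in the studio before relying on them.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the scoring URI of your own endpoint.
SCORING_URI = "http://example.invalid/score"

def build_scoring_request(records, scoring_uri=SCORING_URI, api_key=None):
    """Build (but do not send) a POST request carrying the records as JSON."""
    # Assumed payload shape: a "data" key holding a list of input rows.
    body = json.dumps({"data": records}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: key-based authentication passed as a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")

def score(records, **kwargs):
    """Send the request and decode the JSON response from the endpoint."""
    request = build_scoring_request(records, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))

# Hypothetical customer record; the real columns come from your training dataset.
sample = [{"dropped_calls": 12, "support_tickets": 7}]
request = build_scoring_request(sample)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Calling `score(sample)` would perform the actual network call, so it only works once `SCORING_URI` points to a live endpoint. Keeping request construction separate from sending makes it easy to inspect the payload before you spend time debugging the service itself.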

Summary

In this chapter, you learned how to configure an AutoML process to discover the best model for predicting whether a customer will churn. First, you used the AutoML wizard of the Azure Machine Learning Studio web experience to configure the experiment. Then, you monitored the execution of the run in the Experiments section of the studio interface. Once the training was completed, you reviewed the trained models and saw the information that had been stored regarding the best model. Then, you deployed that machine learning model in an Azure Container Instance and tested that the real-time endpoint performs the requested inferences. Finally, you deleted the deployment to avoid incurring costs in your Azure subscription.

In the next chapter, you will continue exploring the no-code/low-code aspects of the Azure Machine Learning Studio experience by looking at the designer, which allows you to graphically design a training pipeline and operationalize the produced model...

Question

You need to train a classification model but only consider linear models during the AutoML process. Which of the following allows you to do that in the Azure Machine Learning Studio experience?

a) Add all algorithms other than linear ones to the blocked algorithms list.

b) Set the Exit criterion option to a metric score threshold.

c) Disable the automatic featurization option.

d) Disable the deep learning option on the classification task.