You're reading from Automated Machine Learning with Microsoft Azure

Product type: Book
Published in: Apr 2021
Publisher: Packt
ISBN-13: 9781800565319
Edition: 1st Edition
Author: Dennis Michael Sawyers

Dennis Michael Sawyers is a senior cloud solutions architect (CSA) at Microsoft, specializing in data and AI. In his role as a CSA, he helps Fortune 500 companies leverage Microsoft Azure cloud technology to build top-class machine learning and AI solutions. Prior to his role at Microsoft, he was a data scientist at Ford Motor Company in Global Data Insight and Analytics (GDIA) and a researcher in anomaly detection at the highly regarded Carnegie Mellon Auton Lab. He received a master's degree in data analytics from Carnegie Mellon's Heinz College and a bachelor's degree from the University of Michigan. More than anything, Dennis is passionate about democratizing AI solutions through automated machine learning technology.

Chapter 2: Getting Started with Azure Machine Learning Service

Now that we know that the key to delivering return on investment in artificial intelligence is delivering machine learning (ML) projects at a brisk pace, we need to learn how to use Automated Machine Learning (AutoML) to achieve that goal. Before we can do that, however, we need to learn how to use the Azure Machine Learning Service (AMLS). AMLS is Microsoft's premier ML platform on the Azure cloud.

We will begin this chapter by creating an Azure account and creating an AMLS workspace. Once you have created a workspace, you will proceed to create different types of compute to run Python code and ML jobs remotely using a cluster of machines. Next, you will learn how to work with data using the Azure dataset and datastore constructs. Finally, we will provide an overview of AutoML. This will boost your ability to create high-performing models.

In this chapter, we will cover the following topics:

  • Creating...

Technical requirements

In order to complete the exercises in this chapter, you will need the following:

  • Access to the internet
  • A web browser, preferably Google Chrome or Microsoft Edge Chromium
  • A Microsoft account

Creating your first AMLS workspace

Navigating Microsoft Azure for the first time can be a daunting experience. With hundreds of services with similar capabilities, it's easy to get lost. Therefore, it is important for you to follow this guide step by step, beginning by creating an Azure account. If you already have an Azure account, you can skip ahead to the Creating an AMLS workspace section.

Creating an Azure account

Let's begin:

  1. To create an Azure account, navigate to https://azure.microsoft.com.
  2. Click the green Start free button, as shown in the following screenshot. Depending on your region, this button may appear in a slightly different place on the page. Once you've clicked this button, you will be asked to select an email address associated with your Microsoft account:

    Note

    If you use Microsoft Windows, you should have a Microsoft account. If you do not, then you can create a Microsoft account by following the instructions at https://account.microsoft...

Building compute to run your AutoML jobs

The first time you open AML studio, navigate to the Compute tab to create a compute instance and a compute cluster. Once you open the tab, you will see four headings at the top: Compute instances, Compute clusters, Inference clusters, and Attached compute. Let's take a look at these in more detail:

  • Compute instances are virtual machines that you can use to write and run Python code in Jupyter or JupyterLab notebooks; you can also use a compute instance to write R code using RStudio.
  • Compute clusters are groups of virtual machines used to train ML models remotely. You can kick off jobs on a compute cluster and continue working on code in your compute instance.
  • Inference clusters are groups of virtual machines used to score data in real time.
  • Attached compute refers to using Databricks or HDInsight compute to run big data jobs.
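
For readers who prefer code to the GUI, the following is a minimal sketch of how a compute cluster might be created from Python. It assumes the azureml-core (SDK v1) package is installed and that a config.json file for your workspace is available locally; neither is covered in this section. The cluster name, VM size, and node counts are illustrative choices, and cluster_settings and create_cluster are hypothetical helper names, not part of the SDK.

```python
# Sketch only: assumes the azureml-core (SDK v1) package and a config.json
# downloaded from the workspace. Names and sizes below are illustrative.

def cluster_settings(vm_size="STANDARD_DS3_V2", min_nodes=0, max_nodes=4):
    """Return provisioning settings after a basic sanity check."""
    if min_nodes < 0 or max_nodes < 1 or min_nodes > max_nodes:
        raise ValueError("need 0 <= min_nodes <= max_nodes and max_nodes >= 1")
    return {"vm_size": vm_size, "min_nodes": min_nodes, "max_nodes": max_nodes}


def create_cluster(name="automl-cluster", **overrides):
    """Reuse the compute target called `name`, or create it as a cluster."""
    # Imported lazily so cluster_settings works without the SDK installed.
    from azureml.core import Workspace
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.compute_target import ComputeTargetException

    ws = Workspace.from_config()  # reads config.json for the workspace
    try:
        # Reuse an existing compute target with this name, if there is one.
        return ComputeTarget(workspace=ws, name=name)
    except ComputeTargetException:
        config = AmlCompute.provisioning_configuration(
            **cluster_settings(**overrides)
        )
        cluster = ComputeTarget.create(ws, name, config)
        cluster.wait_for_completion(show_output=True)
        return cluster
```

Setting min_nodes to 0 lets the cluster scale down to nothing when idle, which is a common way to avoid paying for unused machines. Calling create_cluster(max_nodes=2) would request a smaller cluster under the same assumptions.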

Let's see them in action.

Creating a compute instance

We'll start...

Working with data in AMLS

Now that you've created your compute resources, all you need to do is create a dataset and you will be ready to run your first AutoML job. Datasets are simply pointers to files in your storage account or pointers to SQL queries on Azure SQL databases.

A dataset is not a file itself. You can create datasets from local files, from SQL queries, or from files in your storage accounts. Azure Open Datasets, publicly available data curated by Microsoft, can also be registered as datasets. For this exercise, we will create a dataset using the Diabetes open dataset.
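
The idea that a dataset is a pointer rather than a file can be illustrated with a small, self-contained sketch. This is not the AMLS API: DatasetPointer is a hypothetical class written with the Python standard library, and the CSV rows are made up. It only shows that a pointer stores a location and reads the data on request, so it reflects whatever the underlying file contains at that moment.

```python
import csv
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetPointer:
    """Hypothetical stand-in for a registered dataset: a name plus a location."""
    name: str
    path: str

    def load(self):
        """Read the rows from the underlying file at the moment of the call."""
        with open(self.path, newline="") as handle:
            return list(csv.DictReader(handle))


def demo():
    """Show that the pointer follows changes made to the file it points to."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "diabetes_sample.csv")
        with open(path, "w", newline="") as handle:
            handle.write("AGE,Y\n59,151\n")  # made-up rows for illustration
        pointer = DatasetPointer(name="Diabetes Sample", path=path)
        before = len(pointer.load())  # one row in the file
        with open(path, "a", newline="") as handle:
            handle.write("48,75\n")  # the file changes, the pointer does not
        after = len(pointer.load())  # the same pointer now sees two rows
        return before, after
```

The pointer object itself never changes in demo; only the file does. That is the practical consequence of registering a reference instead of copying data.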

Creating a dataset using the GUI

Let's begin:

  1. Click the Dataset tab.
  2. Click the Create dataset button, indicated by the blue cross, and you will be presented with a dropdown. Select From Open Datasets, as shown in the following screenshot.

    Note that you can also use this dropdown to create datasets from local files on your computer, from web files, or from data found in your datastores...

Understanding how AutoML works on Azure

Before running your first AutoML experiment, it's important to understand how AutoML works on Azure. AutoML is more than just machine learning, after all. It's also about data transformation and manipulation.

As shown in the following diagram, you can divide the stages of AutoML into roughly five parts: Data Guardrails Check, Intelligent Feature Engineering, Iterative Data Transformation, Iterative ML Model Building, and Model Ensembling. Only at the end of this process does AutoML produce a definitive best model:

Figure 2.17 – The Azure AutoML process
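
To make the order of these stages concrete, here is a toy sketch in plain Python. It is not how Azure AutoML is implemented. It assumes a single numeric feature, collapses feature engineering and data transformation into one rescaling step, tries only two candidate models, and scores them on the training data rather than on a validation set. All function names are hypothetical.

```python
from statistics import mean


def guardrails(xs):
    """Stage 1: fill missing inputs with the mean of the observed ones."""
    fill = mean(x for x in xs if x is not None)
    return [fill if x is None else x for x in xs]


def scale(xs):
    """Stages 2-3 (collapsed): rescale the single feature to the 0-1 range."""
    low, high = min(xs), max(xs)  # assumes at least two distinct values
    return [(x - low) / (high - low) for x in xs]


def fit_mean(xs, ys):
    """Candidate model: always predict the average target."""
    avg = mean(ys)
    return lambda x: avg


def fit_line(xs, ys):
    """Candidate model: least-squares line through the points."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    slope = cov / var
    return lambda x: my + slope * (x - mx)


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


def toy_automl(xs, ys):
    """Run the stages in order and name a best model only at the end."""
    xs = scale(guardrails(xs))
    # Stage 4: build each candidate model in turn.
    models = {"mean": fit_mean(xs, ys), "line": fit_line(xs, ys)}
    # Stage 5: add an ensemble that averages the candidates' predictions.
    members = list(models.values())
    models["ensemble"] = lambda x: mean(m(x) for m in members)
    # Toy scoring on the training data; a real run would hold data out.
    scores = {name: mse(model, xs, ys) for name, model in models.items()}
    return min(scores, key=scores.get), scores
```

Calling toy_automl([1, 2, None, 4], [2, 4, 5, 8]) returns the name of the lowest-error model together with the score of every candidate, including the ensemble, which mirrors the point that a best model is chosen only after all stages have run.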

Let's take a closer look at each step in this process.

Ensuring data quality with data guardrails

Data guardrails check that your data is in the correct format for AutoML and, if it is not, alter the data accordingly. There are currently six main checks that are performed on your data. Two of the checks – one to...
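
The check-then-alter pattern described above can be sketched in a few lines of Python. The six checks are not listed in this excerpt, so the two below are stand-ins chosen for illustration: filling missing numeric values with the column mean, and flagging ID-like columns whose values are almost all distinct. The function names, the status labels, and the 0.9 threshold are assumptions, not Azure's actual settings.

```python
from statistics import mean


def impute_missing(values):
    """Fill None entries with the column mean; report whether anything changed."""
    observed = [v for v in values if v is not None]
    if len(observed) == len(values):
        return "passed", values
    fill = mean(observed)
    return "fixed", [fill if v is None else v for v in values]


def flag_high_cardinality(values, max_unique_ratio=0.9):
    """Alert on ID-like columns where nearly every value is distinct."""
    ratio = len(set(values)) / len(values)
    return ("alerted" if ratio > max_unique_ratio else "passed"), values


def run_guardrails(table):
    """Apply one check per column; return the cleaned table and a report."""
    cleaned, report = {}, {}
    for name, values in table.items():
        numeric = all(v is None or isinstance(v, (int, float)) for v in values)
        if numeric:
            status, values = impute_missing(values)  # may alter the data
        else:
            status, values = flag_high_cardinality(values)  # only reports
        cleaned[name], report[name] = values, status
    return cleaned, report
```

In this sketch, a table is a dictionary that maps column names to lists of values. The report makes it visible which columns were left alone, which were altered, and which only raised an alert.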

Summary

In this chapter, you have learned about all the prerequisites that are necessary for creating AutoML solutions in Azure. You created an AMLS workspace and accessed AML studio before creating the necessary compute to write and run your AutoML jobs. You then loaded data into a datastore and registered it as a dataset to make it available for your AutoML runs.

Importantly, you should now understand the five stages of the AutoML process: a data guardrails check, intelligent feature engineering, iterative data transformation, iterative ML model building, and model ensembling. Everything you have done in this chapter will enable you to create an ML model in record time.

You are now ready for Chapter 3, Training Your First AutoML Model, where you will build your first AutoML model through a GUI. That chapter will cover a range of topics, from examining data to scoring models and explaining results. By the end of that chapter, you will not only be able to train models with AutoML, but you will also be able...
