Azure Data Scientist Associate Certification Guide

Product type Book
Published in Dec 2021
Publisher Packt
ISBN-13 9781800565005
Pages 448 pages
Edition 1st Edition
Authors (2): Andreas Botsikas, Michael Hlobil

Table of Contents (17) Chapters

  • Preface
  • Section 1: Starting your cloud-based data science journey
  • Chapter 1: An Overview of Modern Data Science
  • Chapter 2: Deploying Azure Machine Learning Workspace Resources
  • Chapter 3: Azure Machine Learning Studio Components
  • Chapter 4: Configuring the Workspace
  • Section 2: No code data science experimentation
  • Chapter 5: Letting the Machines Do the Model Training
  • Chapter 6: Visual Model Training and Publishing
  • Section 3: Advanced data science tooling and capabilities
  • Chapter 7: The AzureML Python SDK
  • Chapter 8: Experimenting with Python Code
  • Chapter 9: Optimizing the ML Model
  • Chapter 10: Understanding Model Results
  • Chapter 11: Working with Pipelines
  • Chapter 12: Operationalizing Models with Code
  • Other Books You May Enjoy

Chapter 3: Azure Machine Learning Studio Components

In this chapter, you will explore the Azure Machine Learning Studio (Azure ML Studio) web interface, an immersive experience for managing the end-to-end machine learning life cycle. You will get an overview of the available components that allow you to manage your workspace resources, author machine learning models, and track your assets, including your datasets, trained models, and their published endpoints.

In this chapter, we're going to cover the following main topics:

  • Interacting with the Azure ML resource
  • Exploring the Azure ML Studio experience
  • Authoring experiments within Azure ML Studio
  • Tracking data science assets in Azure ML Studio
  • Managing infrastructure resources in Azure ML Studio

Technical requirements

You will need to have access to an Azure subscription. Within that subscription, you will need a resource group named packt-azureml-rg. You will need to have either a Contributor or Owner access control (IAM) role at the resource group level. Within that resource group, you should deploy a machine learning resource named packt-learning-mlw. These resources should already be available to you if you followed the instructions in Chapter 2, Deploying Azure Machine Learning Workspace Resources.

Interacting with the Azure ML resource

In the previous chapter, you deployed the packt-learning-mlw machine learning resource within the packt-azureml-rg resource group. Navigate to the deployed resource by typing in its name in the top search bar and selecting the resource from the results list:

Figure 3.1 – Navigating to the Azure Machine Learning resource

This will land you on the overview pane of the resource, as shown in Figure 3.2:

  1. On the left-hand side, you will see the typical resource menu that most of the Azure services have. This menu is also referred to as the left pane.
  2. At the top, you will see the command bar, which allows you to delete the machine learning workspace or to download the config.json file. This file contains all the information you need to connect to the workspace through the Python SDK.
  3. Below the command bar, you can see the working pane, which is where you can view information related to the workspace...
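To make the role of config.json more concrete, the following is a minimal sketch that reads such a file using only the Python standard library. It assumes the file holds the three fields that the portal download typically contains (subscription_id, resource_group, and workspace_name); the subscription ID shown is a placeholder and the helper name load_workspace_config is hypothetical. In practice, you would normally let the SDK read the file for you, for example through Workspace.from_config() in the azureml.core package, rather than parsing it yourself.

```python
import json
import tempfile
from pathlib import Path

# Fields assumed to be present in a downloaded config.json.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def load_workspace_config(path):
    """Read a workspace config.json and return its connection fields.

    Hypothetical helper for illustration; it is not part of the Azure ML SDK.
    """
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config.json is missing: {', '.join(missing)}")
    return {key: config[key] for key in REQUIRED_KEYS}


# Build a sample file so that the sketch runs without an Azure subscription.
sample = {
    "subscription_id": "00000000-0000-0000-0000-000000000000",  # placeholder
    "resource_group": "packt-azureml-rg",
    "workspace_name": "packt-learning-mlw",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / "config.json"
    config_path.write_text(json.dumps(sample), encoding="utf-8")
    workspace_info = load_workspace_config(config_path)

print(workspace_info["workspace_name"])  # packt-learning-mlw
```

The file identifies the workspace but contains no credentials, so you still need to authenticate when you connect.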

Exploring the Azure ML Studio experience

Azure Machine Learning comes with a dedicated web interface that allows you to implement both no-code and code-first data science initiatives. You can access the web interface either through the Launch studio button within the Azure portal resource, as you saw in the previous section, or by visiting the https://ml.azure.com page directly. With the latter approach, if this is the first time you have visited the Studio site, you will have to manually select the Azure Active Directory tenant, the Subscription, and the name of the Machine Learning workspace you want to connect to, as shown in Figure 3.3.

Figure 3.3 – Selecting the machine learning workspace in ml.azure.com

Once you've selected your workspace, you will land on the home page of Azure Machine Learning Studio, as shown in Figure 3.4.

Figure 3.4 – Azure Machine Learning Studio home page

On the left-hand side, you...

Authoring experiments within Azure ML Studio

Azure ML Studio provides the following authoring experiences:

  • Notebooks allows you to work with files, folders, and Jupyter Notebooks directly in the workspace. You will be working with notebooks in Chapter 7, The AzureML Python SDK, where you will see the code-first data science process.
  • Automated ML allows you to rapidly test multiple combinations of algorithms against a given dataset and find the best model based on the success metric you define. You will read more about this in Chapter 5, Letting the Machines Do the Model Training.
  • Designer allows you to visually design an experiment by connecting datasets and modules such as data transformation and model training in a flow. By designing this flow on a canvas, you can train and deploy machine learning models without writing any code, something that you will read more about in Chapter 6, Visual Model Training and Publishing.
  • Data Labeling allows you to create labeling...

Tracking data science assets in Azure ML Studio

Within the assets section, you can track all the components that are at the heart of machine learning operations. Every data science project has the following assets:

  • Datasets is where you can find registered datasets. This is a centralized registry where you can register your datasets and avoid colleagues having to work on local copies of the same data or, even worse, subsets of this data. You will work with datasets in Chapter 4, Configuring the Workspace.
  • Experiments is a centralized place to track groups of script executions or runs. When you are training a model, you are logging various aspects of that process, including metrics that you might need to compare performance. To group all attempts under the same context, you should submit all the runs under the same experiment name; then, the results will appear in this area. You will work with experiments in Chapter 5, Letting the Machines Do the Model Training.
  • Pipelines...
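The idea of grouping runs under one experiment name can be illustrated with a small, self-contained sketch. The RunTracker class below is a hypothetical in-memory stand-in and is not the Azure ML SDK; the experiment name churn-model and the metric values are made up for illustration. It only shows why a shared experiment name makes runs easy to compare.

```python
from collections import defaultdict


class RunTracker:
    """Hypothetical in-memory registry that groups runs by experiment name."""

    def __init__(self):
        self._runs = defaultdict(list)

    def log_run(self, experiment_name, metrics):
        """Store one run's metrics under the given experiment name."""
        run_number = len(self._runs[experiment_name]) + 1
        run_id = f"{experiment_name}_{run_number}"
        self._runs[experiment_name].append(
            {"run_id": run_id, "metrics": dict(metrics)}
        )
        return run_id

    def best_run(self, experiment_name, metric, higher_is_better=True):
        """Return the run with the best value for the given metric."""
        candidates = [
            run
            for run in self._runs.get(experiment_name, [])
            if metric in run["metrics"]
        ]
        if not candidates:
            raise ValueError(f"no runs with metric {metric!r}")
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda run: run["metrics"][metric])


tracker = RunTracker()
# Three attempts submitted under the same (made-up) experiment name.
tracker.log_run("churn-model", {"accuracy": 0.81})
tracker.log_run("churn-model", {"accuracy": 0.86})
tracker.log_run("churn-model", {"accuracy": 0.84})

best = tracker.best_run("churn-model", "accuracy")
print(best["run_id"], best["metrics"]["accuracy"])  # churn-model_2 0.86
```

Because all three attempts share one experiment name, picking the best of them is a single lookup. Runs logged under different names would not be compared with each other.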

Managing infrastructure resources in Azure ML Studio

To conduct an experiment, you will need a couple of infrastructure resources to consume. You can configure and manage them through the following sections:

  • Compute provides the managed compute infrastructure you can use in your experiments. This allows you to register and utilize virtual machines that may have multiple CPUs and GPUs, as well as enough memory to load very large datasets. Having those computes as a managed service means that you don't have to worry about installing the operating system or keeping it patched and up to date. You will learn more about the various compute options in Chapter 4, Configuring the Workspace.
  • Datastores contains the connection information needed to get access to the data within various engines, such as Azure Blob Storage and Azure SQL Database. This information is used to access the datasets that you registered in the Datasets section. You will learn more about the concepts...

Summary

Azure Machine Learning Studio provides a web environment where you can manage all the artifacts in your Azure Machine Learning workspace. You can view and manage your Jupyter notebooks, datasets, experiments, pipelines, models, and endpoints. You can also manage the compute resources and datastores that will be used in your experiments. Studio also offers interactive tools you can use to perform no-code data science experiments, which you will explore in depth in the next chapters of this book. The AutoML wizard is the first no-code experience that's built into Azure ML Studio and allows you to run automated machine learning experiments. Azure Machine Learning designer is the next no-code experience and helps you graphically design pipelines and create workflows without writing code. This experience also enables low-code scenarios, where you can drop in code snippets if needed. Finally, data labeling projects allow you to create, manage, and monitor tedious projects to...
