You're reading from Azure Data Scientist Associate Certification Guide

Product type: Book

Published in: Dec 2021

Reading level: Beginner

Publisher: Packt

ISBN-13: 9781800565005

Edition: 1st Edition

Languages: Python

Tools: Azure Functions

Concepts: Machine Learning

Authors (2):

Andreas Botsikas

Michael Hlobil

Preface

This book helps you acquire practical knowledge about machine learning experimentation on Azure. It covers everything you need to know and understand to become a certified Azure Data Scientist Associate.

The book starts with an introduction to data science, making sure you are familiar with the terminology used throughout the book. You then move into the Azure Machine Learning (AzureML) workspace, your working area for the rest of the book. You will discover the studio interface and manage the various components, like the data stores and the compute clusters.

You will then focus on no-code and low-code experimentation. You will discover the Automated ML wizard, which helps you locate and deploy optimal models for your dataset. You will also learn how to run end-to-end data science experiments using the designer provided in AzureML studio.

You will then dive deep into code-first data science experimentation. You will explore the AzureML Software Development Kit (SDK) for Python and learn how to create experiments and publish models using code. You will learn how to use powerful compute clusters to scale your machine learning jobs up and out, and how to optimize your model’s hyperparameters using Hyperdrive. Then you will learn how to use responsible AI tools to interpret and debug your models. Once you have a trained model, you will learn how to operationalize it for batch or real-time inference and how to monitor it in production.

With this knowledge, you will have a good understanding of the Azure Machine Learning platform and you will be able to clear the DP-100 exam with flying colors.

Who this book is for

The book targets two audiences: developers who seek to infuse their applications with AI capabilities and data scientists who want to scale their ML experiments in the Azure cloud. Basic knowledge of Python is needed to follow the code samples present in the book. Some experience in training ML models in Python with common frameworks such as scikit-learn will help you understand the content more easily.

What this book covers

Chapter 1, An Overview of Modern Data Science, provides you with the terminology used throughout the book.

Chapter 2, Deploying Azure Machine Learning Workspace Resources, helps you understand the deployment options for an Azure Machine Learning (AzureML) workspace.

Chapter 3, Azure Machine Learning Studio Components, provides an overview of the studio web interface you will be using to conduct your data science experiments.

Chapter 4, Configuring the Workspace, helps you understand how to provision computational resources and connect to data sources that host your datasets.

Chapter 5, Letting the Machines Do the Model Training, guides you through your first Automated Machine Learning (AutoML) experiment and shows you how to deploy the best-trained model as a web endpoint through the studio’s wizards.

Chapter 6, Visual Model Training and Publishing, helps you author a training pipeline through the studio’s designer experience. You will learn how to operationalize the trained model through a batch or a real-time pipeline by promoting the trained pipeline within the designer.

Chapter 7, The AzureML Python SDK, gets you started with code-first data science experimentation. You will understand how the AzureML Python SDK is structured, and you will learn how to manage AzureML resources like compute clusters with code.

Chapter 8, Experimenting with Python Code, helps you train your first machine learning model with code. It guides you on how to track model metrics and scale out your training efforts to bigger compute clusters.

Chapter 9, Optimizing the ML Model, shows you how to optimize your machine learning model with hyperparameter tuning and helps you discover the best model for your dataset by kicking off an AutoML experiment with code.

Chapter 10, Understanding Model Results, introduces you to the concept of responsible AI and dives deep into the tools that allow you to interpret your models’ predictions, analyze the errors that your models are prone to, and detect potential fairness issues.

Chapter 11, Working with Pipelines, guides you on authoring repeatable processes by defining multi-step pipelines using the AzureML Python SDK.

Chapter 12, Operationalizing Models with Code, helps you register your trained models and operationalize them through real-time web endpoints or batch parallel processing pipelines.

To get the most out of this book

This book tries to provide you with everything you need to learn. The Further reading section of each chapter contains links to pages that will help you dive deeper into topics that are peripheral to the contents of this book. It will help if you have some basic familiarity with the Azure portal and have read some Python code in the past.

In this book, we guide you to use the Notebooks experience available within the AzureML studio. If you want to execute the same code on your workstation instead of the cloud-based experience, you will need a Python environment to run Jupyter notebooks. The easiest way to run Jupyter notebooks on your workstation is through Visual Studio Code (VS Code), a free cross-platform editor with fantastic Python support. You will also need to install Git on your workstation to clone the book’s GitHub repository.

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book's GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

If you face any issues executing the code, ensure that you have cloned the latest version from the GitHub repository. If the problem persists, feel free to open a GitHub issue describing the problem you are facing so that we can help you solve it.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Azure-Data-Scientist-Associate-Certification-Guide. If there's an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781800565005_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "You can also change the autogenerated name of the pipeline you are designing. Rename the current pipeline to test-pipeline."

A block of code is set as follows:

from azureml.train.hyperdrive import GridParameterSampling, choice

# Grid sampling over two hyperparameters, each with two discrete values
param_sampling = GridParameterSampling({
    "a": choice(0.01, 0.5),
    "b": choice(10, 100)
})
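As a rough illustration of what a grid over discrete choices means, the following sketch enumerates every combination of the values shown above using only the Python standard library. It does not call the AzureML SDK; the `search_space` dictionary and the `grid_combinations` helper are hypothetical names introduced here for illustration only.

```python
from itertools import product

# Hypothetical stand-in for the sample search space: each key maps to the
# discrete values that were passed to choice(...) in the sample block.
search_space = {
    "a": [0.01, 0.5],
    "b": [10, 100],
}


def grid_combinations(space):
    """Return every combination of the discrete values in `space`."""
    names = list(space)
    value_lists = [space[name] for name in names]
    # product(...) yields the Cartesian product of the value lists.
    return [dict(zip(names, values)) for values in product(*value_lists)]


combinations = grid_combinations(search_space)
for combo in combinations:
    print(combo)
```

With two values for each of the two hyperparameters, the grid contains four combinations.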

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

from azureml.core import Workspace
ws = Workspace.from_config()
loans_ds = ws.datasets['loans']
compute_target = ws.compute_targets['cpu-sm-cluster']

Any command-line input or output is written as follows:

az group create --name my-name-rg --location westeurope

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: "Navigate to the Author | Notebooks section of your AzureML Studio web interface."

Tips or important notes appear as follows:

Run numbers may be different in your executions. Every time you execute the cells, a new run number is created, continuing from the previous number. So, if you execute code that performs one hyperdrive run with 20 child runs, the last child run will be run 21. The next time you execute the same code, the hyperdrive run will start from run 22, and the last child will be run 42. The run numbers referred to in this section are the ones shown in the various figures, and it is normal to observe differences, especially if you had to rerun a couple of cells.
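The arithmetic in this note can be sketched in a few lines of Python. This is only an illustration under two assumptions drawn from the note itself: numbering starts at run 1 in a fresh experiment, and the parent hyperdrive run takes one number before its child runs. The helper name `hyperdrive_run_numbers` is hypothetical and is not part of the AzureML SDK.

```python
def hyperdrive_run_numbers(first_run, child_runs):
    """Return (parent_run, last_child_run) for one hyperdrive execution.

    Assumes the parent run takes the first available number and each
    child run takes the next consecutive number.
    """
    parent = first_run
    last_child = parent + child_runs
    return parent, last_child


# First execution in a fresh experiment, with 20 child runs.
first = hyperdrive_run_numbers(first_run=1, child_runs=20)

# Second execution continues from the number after the last child run.
second = hyperdrive_run_numbers(first_run=first[1] + 1, child_runs=20)

print(first, second)
```

Under these assumptions, the first execution ends at run 21, and the second starts at run 22 and ends at run 42, matching the numbers in the note.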

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Authors (2)

Andreas Botsikas

Andreas Botsikas is an experienced advisor working in the software industry. He has worked in the finance sector, leading highly efficient DevOps teams, and architecting and building high-volume transactional systems. He then traveled the world, building AI-infused solutions with a group of engineers and data scientists. Currently, he works as a trusted advisor for customers onboarding into Azure, de-risking and accelerating their cloud journey. He is a strong engineering professional with a Doctor of Philosophy (Ph.D.) in resource optimization with artificial intelligence from the National Technical University of Athens.

Michael Hlobil

Michael Hlobil is an experienced architect focused on quickly understanding customers' business needs, with over 25 years of experience in IT pitfalls and successful projects, and is dedicated to creating solutions based on the Microsoft Platform. He has an MBA in Computer Science and Economics (from the Technical University and the University of Vienna) and an MSc (from the ESBA) in Systemic Coaching. He has worked on advanced analytics projects over the last decade, including massively parallel systems and machine learning systems. He enjoys working with customers and supporting their journey to the cloud.