You're reading from Automated Machine Learning with Microsoft Azure

Product type: Book

Published in: Apr 2021

Publisher: Packt

ISBN-13: 9781800565319

Edition: 1st Edition

Tools: Azure Functions

Concepts: Machine Learning

Author (1)

Dennis Michael Sawyers

Chapter 9: Implementing a Batch Scoring Solution

You have trained regression, classification, and forecasting models with AutoML in Azure, and now it's time to learn how to put them into production and use them. Machine learning (ML) models, after all, are ultimately used to make predictions on new data, either in real time or in batches. To score new data points in batches in Azure, you must first create an ML pipeline.

An ML pipeline lets you run repeatable Python code in the Azure Machine Learning Service (AMLS), either on demand or on a schedule. While you can run any Python code using an ML pipeline, here you will learn how to build pipelines for scoring new data.

You will begin this chapter by writing a simple ML pipeline to score data using the multiclass classification model you trained on the Iris dataset in Chapter 5, Building an AutoML Classification Solution. Using the same data, you will then learn how to score new data points in parallel, enabling you...

Technical requirements

This chapter will feature a lot of coding using Jupyter notebooks within AMLS. Thus, you will need a working internet connection, an AMLS workspace, and a compute instance. ML pipelines also require a compute cluster. You will also need to have trained and registered the Iris multiclass classification model in Chapter 5, Building an AutoML Classification Solution.

The following are the prerequisites for the chapter:

Access to the internet.
A web browser, preferably Google Chrome or Microsoft Edge Chromium.
A Microsoft Azure account.
Have created an AMLS workspace.
Have created the compute-cluster compute cluster in Chapter 2, Getting Started with Azure Machine Learning Service.
Understand how to navigate to the Jupyter environment from an Azure compute instance as demonstrated in Chapter 4, Building an AutoML Regression Solution.
Have trained and registered the Iris-Multi-Classification-AutoML ML model in Chapter 5, Building...

Creating an ML pipeline

ML pipelines are Azure's solution for batch scoring with ML models. You can use ML pipelines to score data with any model you train, including your own custom models as well as AutoML-generated models. In this chapter, you will create them via code using the Azure ML Python SDK. In this section, you will code a simple pipeline to score Iris data using the Iris-Multi-Classification-AutoML model you trained in Chapter 5, Building an AutoML Classification Solution.

As in other chapters, you will begin by opening your compute instance and navigating to your Jupyter notebook environment. You will then create and name a new notebook. Once your notebook is created, you will build, configure, and run an ML pipeline step by step. After confirming that your pipeline has run successfully, you will publish it to a pipeline endpoint. Pipeline endpoints are simply URLs: web addresses that trigger ML pipeline runs when called.
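The build, run, and publish sequence described above could look roughly like the following. This is a minimal sketch that assumes the Azure ML Python SDK v1 (azureml-core and azureml-pipeline) and a workspace config file in the notebook folder. The script name, folder, experiment name, and pipeline name are hypothetical placeholders, not the names used in the chapter, and a real run would also need an environment that contains the model's dependencies.

```python
def build_and_publish_scoring_pipeline(
    compute_name="compute-cluster",         # cluster named in the prerequisites
    script_name="score.py",                 # hypothetical scoring script
    source_directory="scoring_scripts",     # hypothetical folder holding the script
    experiment_name="batch-scoring",        # hypothetical experiment name
    pipeline_name="batch-scoring-pipeline", # hypothetical published name
):
    """Build a one-step pipeline, run it once, publish it, return its URL."""
    # SDK imports live inside the function so this file loads without the SDK.
    from azureml.core import Experiment, Workspace
    from azureml.core.compute import ComputeTarget
    from azureml.core.runconfig import RunConfiguration
    from azureml.pipeline.core import Pipeline
    from azureml.pipeline.steps import PythonScriptStep

    # Connect to the workspace and look up the existing compute cluster.
    ws = Workspace.from_config()
    compute_target = ComputeTarget(workspace=ws, name=compute_name)

    # Default run configuration; assumption: you attach your own environment.
    run_config = RunConfiguration()

    # A pipeline step that simply runs the scoring script on the cluster.
    scoring_step = PythonScriptStep(
        name="score-new-data",
        script_name=script_name,
        source_directory=source_directory,
        compute_target=compute_target,
        runconfig=run_config,
        allow_reuse=False,  # always rescore instead of reusing cached output
    )

    # Assemble the pipeline, run it, and wait so you can confirm success.
    pipeline = Pipeline(workspace=ws, steps=[scoring_step])
    run = Experiment(ws, experiment_name).submit(pipeline)
    run.wait_for_completion(show_output=True)

    # Publishing exposes the pipeline behind a REST endpoint (a URL).
    published = pipeline.publish(
        name=pipeline_name,
        description="Batch scoring pipeline (sketch)",
        version="1.0",
    )
    return published.endpoint
```

Calling the function from a notebook cell on a compute instance would return the endpoint URL that later triggers can call.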

The following steps deviate greatly from previous chapters. You...

Creating a parallel scoring pipeline

Standard ML pipelines work just fine for the majority of ML use cases, but when you need to score a large amount of data at once, you need a more powerful solution. That's where ParallelRunStep comes in. ParallelRunStep is Azure's answer to scoring big data in batch. When you use ParallelRunStep, you leverage all of the cores on your compute cluster simultaneously.
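A ParallelRunStep entry script follows a small contract: an init() function that runs once per worker process, and a run(mini_batch) function that is called for each mini-batch. The sketch below illustrates that contract under several assumptions: the input is a file dataset, so mini_batch is a list of file paths; the files are CSVs with a hypothetical petal_length column; and StandInModel is a made-up placeholder for the registered model you would normally load in init().

```python
import csv

model = None  # populated once per worker process by init()


class StandInModel:
    """Hypothetical placeholder for a registered model."""

    def predict(self, rows):
        # Toy rule on a hypothetical column, used only to make the sketch run.
        return [
            "long-petal" if float(row["petal_length"]) > 2.5 else "short-petal"
            for row in rows
        ]


def init():
    """Called once per process: load the model here."""
    global model
    model = StandInModel()


def run(mini_batch):
    """Called once per mini-batch: score every row of every file in it."""
    results = []
    for file_path in mini_batch:
        with open(file_path, newline="") as handle:
            rows = list(csv.DictReader(handle))
        for row, label in zip(rows, model.predict(rows)):
            # One output line per scored input row.
            results.append(f"{file_path},{row['petal_length']},{label}")
    return results
```

Because each process calls init() once and then run() repeatedly, expensive work such as loading the model stays out of the per-batch path.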

Say you have a compute cluster consisting of eight Standard_DS3_v2 virtual machines. Each Standard_DS3_v2 node has four cores, so you can perform 32 parallel scoring processes at once. This parallelization essentially lets you score data many times faster than if you used a single machine. Furthermore, it can easily scale vertically (increasing the size of each virtual machine in the cluster) and horizontally (increasing the node count).
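The arithmetic above maps directly onto two ParallelRunConfig settings in the Azure ML Python SDK v1: node_count and process_count_per_node. The following sketch shows the calculation and one way the configuration might be written. The folder and script names are hypothetical, and the environment and compute target are assumed to be objects you have already created.

```python
def max_parallel_processes(node_count, cores_per_node):
    """Upper bound on simultaneous scoring processes: one per core."""
    return node_count * cores_per_node


def make_parallel_run_config(environment, compute_target,
                             node_count=8, cores_per_node=4):
    """Sketch of a ParallelRunConfig that uses every core on every node."""
    # Imported here so the arithmetic above works without the SDK installed.
    from azureml.pipeline.steps import ParallelRunConfig

    return ParallelRunConfig(
        source_directory="scoring_scripts",   # hypothetical folder
        entry_script="parallel_score.py",     # hypothetical entry script
        error_threshold=10,                   # failures tolerated before aborting
        output_action="append_row",           # gather all results into one file
        environment=environment,
        compute_target=compute_target,
        node_count=node_count,                   # horizontal scale
        process_count_per_node=cores_per_node,   # one process per core
    )


# Eight Standard_DS3_v2 nodes with four cores each.
print(max_parallel_processes(8, 4))  # prints 32
```

Scaling horizontally changes node_count, while scaling vertically changes the cores available per node and therefore the sensible value for process_count_per_node.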

This section will allow you to become a big data scientist who can score large batches of data. Here, you will again be using simulated...

Creating an AutoML training pipeline

Sometimes, it's necessary to retrain a model that you trained in AutoML. ML models can degrade over time if the relationship between your data and your target variable changes. This is true for all ML models, not just ones generated by AutoML.

Imagine, for example, that you build an ML model to predict demand for frozen pizza at a supermarket, and then one day, a famous pizza chain sets up shop next door. It's very likely that consumer buying behavior will change, and you will need to retrain the model.
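The pizza scenario can be made concrete with a toy check for model degradation. The sketch below uses made-up numbers and a hand-rolled linear fit rather than anything from AutoML: it fits a line to sales before the competitor arrived, measures the error on sales afterward, and flags the model for retraining when the error grows past a tolerance. The function names and the 1.5x tolerance are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(model, xs, ys):
    """Average absolute gap between the line's predictions and actual values."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


def needs_retraining(baseline_error, current_error, tolerance=1.5):
    """Flag the model when error exceeds the baseline by the given factor."""
    return current_error > tolerance * baseline_error


# Made-up data: weekly promotions run (x) and frozen pizza units sold (y).
promotions = [1, 2, 3, 4, 5]
sales_before = [112, 118, 131, 139, 150]  # before the pizza chain opened
sales_after = [66, 69, 76, 80, 84]        # after the pizza chain opened

model = fit_line(promotions, sales_before)
baseline_error = mean_absolute_error(model, promotions, sales_before)
current_error = mean_absolute_error(model, promotions, sales_after)

print(needs_retraining(baseline_error, current_error))  # prints True
```

The model itself did not change; the relationship between the inputs and the target did, which is why the same check applies to any ML model.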

Luckily, AMLS has specialized ML pipeline steps built specifically for retraining models. In this section, we are going to use one of those steps, the AutoML step. The AutoML step lets you retrain models easily whenever you want, either with a push of a button or on a schedule.
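As a rough illustration of the AutoML step, the sketch below wraps an AutoMLConfig in an AutoMLStep using the Azure ML Python SDK v1. It assumes a classification task; the label column name, step name, and timeout are hypothetical choices, and training_data and compute_target are assumed to be a registered tabular dataset and an existing compute cluster.

```python
def build_automl_training_step(training_data, compute_target,
                               label_column_name="species"):
    """Return a pipeline step that retrains a classification model with AutoML."""
    # Imported lazily so this file loads without the SDK installed.
    from azureml.pipeline.steps import AutoMLStep
    from azureml.train.automl import AutoMLConfig

    automl_config = AutoMLConfig(
        task="classification",
        primary_metric="accuracy",
        training_data=training_data,
        label_column_name=label_column_name,  # hypothetical column name
        compute_target=compute_target,
        n_cross_validations=5,
        experiment_timeout_hours=0.25,        # stop the search after 15 minutes
    )

    # allow_reuse=False forces a fresh training run every time the pipeline runs.
    return AutoMLStep(
        name="automl-training",               # hypothetical step name
        automl_config=automl_config,
        allow_reuse=False,
    )
```

The returned step can be placed in a Pipeline like any other step, typically followed by a step that registers the best model.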

Here, you will build a two-step ML pipeline where you will first train a model with an AutoML step and register it with...

Triggering and scheduling your ML pipelines

One of the biggest problems data scientists face is creating easy, rerunnable, production-ready code and scheduling it in an automatic, reliable manner. You've already accomplished the first part by creating your three ML pipelines. Now, it's time to learn how to do the second part.

In this section, you will first learn how to manually trigger the pipelines you've created through the GUI. Then, you will learn how to trigger the pipelines via code, both manually and on an automated schedule. This will enable you to put your ML pipelines into production, generating results on an hourly, daily, weekly, or monthly basis.
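Triggering and scheduling from code could be sketched as follows, assuming the Azure ML Python SDK v1 and the requests package. The helper names, the schedule name, and the cadence mapping are illustrative assumptions; endpoint_url and pipeline_id are assumed to come from a pipeline you have already published.

```python
# Map the cadences mentioned in the text to SDK recurrence frequencies.
FREQUENCIES = {"hourly": "Hour", "daily": "Day", "weekly": "Week", "monthly": "Month"}


def to_sdk_frequency(cadence):
    """Translate a plain-English cadence into a ScheduleRecurrence frequency."""
    try:
        return FREQUENCIES[cadence.lower()]
    except KeyError:
        raise ValueError(f"Unsupported cadence: {cadence!r}") from None


def trigger_published_pipeline(endpoint_url, experiment_name):
    """Manually trigger a published pipeline by POSTing to its endpoint."""
    import requests
    from azureml.core.authentication import InteractiveLoginAuthentication

    auth_header = InteractiveLoginAuthentication().get_authentication_header()
    response = requests.post(
        endpoint_url,
        headers=auth_header,
        json={"ExperimentName": experiment_name},
    )
    response.raise_for_status()
    return response.json().get("Id")  # ID of the pipeline run that was started


def schedule_published_pipeline(ws, pipeline_id, experiment_name, cadence="daily"):
    """Create a recurring schedule for a published pipeline."""
    from azureml.pipeline.core.schedule import Schedule, ScheduleRecurrence

    recurrence = ScheduleRecurrence(frequency=to_sdk_frequency(cadence), interval=1)
    return Schedule.create(
        ws,
        name=f"{experiment_name}-{cadence}",  # hypothetical schedule name
        pipeline_id=pipeline_id,
        experiment_name=experiment_name,
        recurrence=recurrence,
    )
```

For example, schedule_published_pipeline(ws, published.id, "batch-scoring", "weekly") would request one run per week, where published is the object returned when the pipeline was published.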

Triggering your published pipeline from the GUI

Triggering your published pipeline from the AML studio GUI is easy. However, you cannot currently set up an automated schedule for your ML pipelines from the GUI. As such, the GUI is most useful for triggering training pipelines when you notice that your results seem off...

Summary

You have now implemented a fully automated ML batch scoring solution using an AutoML-trained model. You've created pipelines that can score new data, pipelines that can process big data in parallel, and pipelines that can retrain AutoML models. You can trigger them whenever you want, and you can even set up an automated scoring schedule. This is no small feat, as many organizations have spent years trying to learn best practices for these tasks.

In Chapter 10, Creating End-to-End AutoML Solutions, you will cement your knowledge as you learn how to ingest data into Azure, score it with ML pipelines, and write your results to whatever location you want.

Dennis Michael Sawyers is a senior cloud solutions architect (CSA) at Microsoft, specializing in data and AI. In his role as a CSA, he helps Fortune 500 companies leverage Microsoft Azure cloud technology to build top-class machine learning and AI solutions. Prior to his role at Microsoft, he was a data scientist at Ford Motor Company in Global Data Insight and Analytics (GDIA) and a researcher in anomaly detection at the highly regarded Carnegie Mellon Auton Lab. He received a master's degree in data analytics from Carnegie Mellon's Heinz College and a bachelor's degree from the University of Michigan. More than anything, Dennis is passionate about democratizing AI solutions through automated machine learning technology.