
You're reading from Automated Machine Learning with Microsoft Azure

Product type: Book
Published in: Apr 2021
Publisher: Packt
ISBN-13: 9781800565319
Edition: 1st Edition
Author (1)
Dennis Michael Sawyers

Dennis Michael Sawyers is a senior cloud solutions architect (CSA) at Microsoft, specializing in data and AI. In his role as a CSA, he helps Fortune 500 companies leverage Microsoft Azure cloud technology to build top-class machine learning and AI solutions. Prior to his role at Microsoft, he was a data scientist at Ford Motor Company in Global Data Insight and Analytics (GDIA) and a researcher in anomaly detection at the highly regarded Carnegie Mellon Auton Lab. He received a master's degree in data analytics from Carnegie Mellon's Heinz College and a bachelor's degree from the University of Michigan. More than anything, Dennis is passionate about democratizing AI solutions through automated machine learning technology.

Chapter 7: Using the Many Models Solution Accelerator

Now that you have built regression, classification, and forecasting models with AutoML, it's time for you to learn how to deploy and use those models in actual business scenarios. Before you tackle this, however, we will first introduce you to a powerful new solution: the Many Models Solution Accelerator (MMSA).

The MMSA lets you build hundreds to thousands of machine learning (ML) models at once and easily scales to hundreds of thousands of models. It's an advanced technology at the cutting edge of ML. Not only can you build hundreds of thousands of models, but you can also use the MMSA to easily deploy them into production.

In this chapter, you will begin by installing the accelerator and understanding the various use cases to which it applies. You will then run the three sections of the accelerator notebook-by-notebook: prepping data, training models, and forecasting new data...

Technical requirements

Within this chapter, you will log in to your Azure Machine Learning studio (AMLS), open up a Jupyter notebook on a compute instance, and install the MMSA from its location on GitHub. You will then run all three pieces of the MMSA sequentially, prepping the data, training the models remotely, and forecasting the data. As such, you need to have an Azure account, a compute instance for writing Python code, and a compute cluster for remote training. The full list of requirements is as follows:

  • Access to the internet.
  • A web browser, preferably Google Chrome or Microsoft Edge Chromium.
  • A Microsoft Azure account.
  • You should have created an AMLS workspace.
  • You should have created a compute instance in Chapter 2, Getting Started with Azure Machine Learning Service.
  • You should have created the compute cluster in Chapter 2, Getting Started with Azure Machine Learning Service.
  • You should understand how to navigate to the Jupyter environment...

Installing the Many Models Solution Accelerator

The MMSA was built by Microsoft in 2019 to address the needs of a growing number of customers who wanted to train hundreds of thousands of similar ML models simultaneously. This is particularly important for product demand forecasting, where you are trying to make forecasts for many different products at many different locations.

The impetus for the accelerator is model accuracy. While you could train a single model to predict product demand across all of your product lines and all of your stores, you will find that training individual models for each combination of product and store tends to yield superior performance. This is because model performance depends on both your algorithm and your data, and it can be very difficult for some algorithms to find meaningful patterns when you're dealing with hundreds of thousands of different products distributed across the globe.
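To make the accuracy argument concrete, here is a minimal sketch that compares one global model with one model per store and brand combination. It uses only the Python standard library and a hand-written least-squares fit; the rows, the column meanings, and the helper names are invented for illustration and are not part of the MMSA. The errors are measured on the training rows themselves, so treat the comparison as a demonstration of the idea rather than a proper evaluation.

```python
from collections import defaultdict

# Toy weekly sales rows: (store, brand, price, quantity). Values are invented.
ROWS = [
    ("S1", "A", 1.0, 100.0), ("S1", "A", 2.0, 80.0), ("S1", "A", 3.0, 60.0),
    ("S2", "B", 1.0, 30.0), ("S2", "B", 2.0, 28.0), ("S2", "B", 3.0, 26.0),
]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x on (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def mean_abs_error(model, points):
    """Average absolute difference between actual and fitted values."""
    intercept, slope = model
    return sum(abs(y - (intercept + slope * x)) for x, y in points) / len(points)


# One global model fitted over every row.
all_points = [(price, qty) for _, _, price, qty in ROWS]
global_model = fit_line(all_points)
global_mae = mean_abs_error(global_model, all_points)

# One model per (store, brand) combination.
groups = defaultdict(list)
for store, brand, price, qty in ROWS:
    groups[(store, brand)].append((price, qty))
group_models = {key: fit_line(points) for key, points in groups.items()}

# Row-weighted average of the per-group errors.
per_group_mae = sum(
    mean_abs_error(group_models[key], points) * len(points)
    for key, points in groups.items()
) / len(ROWS)

print(f"global MAE: {global_mae:.2f}, per-group MAE: {per_group_mae:.2f}")
```

In this toy data, the two groups respond to price very differently, so a single line cannot fit both, while each per-group line fits its own rows closely.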

Additionally, the same columns can have different...

Prepping data for many models

While training thousands of ML models simultaneously sounds complicated, the MMSA makes it easy. The example included in the notebooks uses the OJ Sales data you used in Chapter 6, Building an AutoML Forecasting Solution. You will prepare the data simply by opening and running 01_Data_Preparation.ipynb. By reading the instructions carefully step by step and working through the notebook slowly, you will be able to understand what each section is about.

Once you're able to understand what each section is doing and you have the OJ Sales data loaded, you will be able to load the new dataset into your Jupyter notebook. This way, by the end of this section, you will be able to load your own data into Azure, modify it for the MMSA, and master the ability to use this powerful solution.
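When you move on to your own data, the key preparation idea is to split one combined dataset into a separate file for each group you want a model for. The following is a minimal sketch of that idea using only the standard library, not the notebook's own code. The column names mirror the OJ Sales example, but the rows, the `Store` and `Brand` partition columns, and the file naming scheme are assumptions made for illustration.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path

# Hypothetical combined sales rows; the values are invented for this sketch.
ROWS = [
    {"WeekStarting": "1990-06-14", "Store": "1000", "Brand": "dominicks", "Quantity": "12003"},
    {"WeekStarting": "1990-06-21", "Store": "1000", "Brand": "dominicks", "Quantity": "10239"},
    {"WeekStarting": "1990-06-14", "Store": "1000", "Brand": "tropicana", "Quantity": "17311"},
    {"WeekStarting": "1990-06-14", "Store": "1001", "Brand": "tropicana", "Quantity": "9832"},
]
PARTITION_COLUMNS = ["Store", "Brand"]  # assumed grouping columns


def split_into_partitions(rows, partition_columns):
    """Group rows by the values of the partition columns."""
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in partition_columns)
        partitions[key].append(row)
    return dict(partitions)


def write_partitions(partitions, out_dir):
    """Write one CSV per partition; the file naming scheme is illustrative."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for key, rows in sorted(partitions.items()):
        path = out_dir / ("_".join(key) + ".csv")
        with path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        paths.append(path)
    return paths


partitions = split_into_partitions(ROWS, PARTITION_COLUMNS)
written = write_partitions(partitions, tempfile.mkdtemp())
print([path.name for path in written])
```

Each resulting file holds the history for exactly one store and brand combination, which is the unit the accelerator trains a model on.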

Prepping the sample OJ dataset

To understand how the first notebook works, follow these instructions in order:

  1. Open 01_Data_Preparation.ipynb.
  2. Run all...

Training many models simultaneously

Like prepping data for many models, training many models is simply a matter of navigating to the correct notebook and running the cells. No custom code is required; you only need to change a few settings.
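As a rough illustration of what changing a few settings can look like, the sketch below defines a settings dictionary and checks it against your dataset's columns before you submit anything. The key names follow the AutoML forecasting configuration as best understood here and should be verified against the notebook itself; the values and the `find_setting_problems` helper are hypothetical.

```python
# Illustrative settings only: verify the key names against the notebook.
# The values are placeholders, not recommendations.
automl_settings = {
    "task": "forecasting",
    "primary_metric": "normalized_root_mean_squared_error",
    "label_column_name": "Quantity",
    "time_column_name": "WeekStarting",
    "n_cross_validations": 3,
    "iterations": 15,
    "partition_column_names": ["Store", "Brand"],
}


def find_setting_problems(settings, dataset_columns):
    """Return human-readable problems found before submitting a run."""
    problems = []
    columns = set(dataset_columns)
    # The label and time columns must exist in the data.
    for key in ("label_column_name", "time_column_name"):
        if settings.get(key) not in columns:
            problems.append(f"{key}={settings.get(key)!r} is not a dataset column")
    # Every partition column must exist in the data as well.
    for col in settings.get("partition_column_names", []):
        if col not in columns:
            problems.append(f"partition column {col!r} is not a dataset column")
    return problems


columns = ["WeekStarting", "Store", "Brand", "Quantity", "Price"]
print(find_setting_problems(automl_settings, columns))
```

A check like this is cheap to run locally and can catch a misspelled column name before a remote training run starts.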

As with prepping data, you will first run the notebook step by step to carefully understand how it works. Once you have that understanding, you will then create a new notebook with code that uses the datasets you made from the sample data. This will benefit you tremendously, as you will understand exactly which parts of the code you need to change to facilitate your own projects.

Training the sample OJ dataset

To train many models using the OJ data and to understand the underlying process, follow these instructions step by step:

  1. From the solution-accelerator-many-models folder, click on the Automated_ML folder.
  2. From the Automated_ML folder, click on the 02_AutoML_Training_Pipeline folder.
  3. Open 02_AutoML_Training_Pipeline...

Scoring new data for many models

Scoring new data with the MMSA is a fairly straightforward task. Once you have your models trained, simply navigate to the correct notebook, change your variables to match your training notebook, and click the run button. As there are very few settings to alter compared to the training notebook, it's even easier to use with your own code.

In this section, like the others, first you will run the out-of-the-box scoring notebook with OJ Sales. Then, you will create a new notebook to score the sample data.
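The reason your scoring variables must match your training notebook is that each new row has to be routed to the model trained for its group. The sketch below shows that routing in plain Python. The stand-in models, the rows, and the `score_rows` helper are hypothetical and only illustrate the matching logic; the accelerator handles this for you at scale.

```python
PARTITION_COLUMNS = ("Store", "Brand")  # must match the training setup

# Stand-ins for trained models: each maps a row to a forecast. Here every
# "model" returns a fixed number, purely for illustration.
MODELS = {
    ("1000", "dominicks"): lambda row: 11000.0,
    ("1000", "tropicana"): lambda row: 17000.0,
}

# Hypothetical new rows to score; the values are invented.
NEW_ROWS = [
    {"WeekStarting": "1992-10-08", "Store": "1000", "Brand": "dominicks"},
    {"WeekStarting": "1992-10-08", "Store": "1000", "Brand": "tropicana"},
    {"WeekStarting": "1992-10-08", "Store": "1001", "Brand": "tropicana"},
]


def score_rows(rows, models, partition_columns):
    """Route each row to the model trained for its partition."""
    scored, unmatched = [], []
    for row in rows:
        key = tuple(row[col] for col in partition_columns)
        model = models.get(key)
        if model is None:
            unmatched.append(key)  # no model was trained for this partition
            continue
        scored.append({**row, "Prediction": model(row)})
    return scored, unmatched


scored, unmatched = score_rows(NEW_ROWS, MODELS, PARTITION_COLUMNS)
print(len(scored), "rows scored;", "no model for", unmatched)
```

If the partition columns used for scoring differ from those used in training, the keys will not line up and rows will end up unmatched, which is why the variables need to agree between the two notebooks.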

Scoring OJ sales data with the MMSA

To score OJ Sales data with the multiple models you've trained, follow these steps:

  1. From the solution-accelerator-many-models folder, open the Automated_ML folder.
  2. From the Automated_ML folder, open the 03_AutoML_Forecasting_Pipeline folder.
  3. Open 03_AutoML_Forecasting_Pipeline.ipynb.
  4. Run all of the cells in section 1.0. These cells set up your AMLS workspace, compute cluster...

Improving your many models results

Now that you have adapted all three of the notebooks to run your own code, you should be feeling pretty confident in your ability to use the MMSA. Still, it's pretty easy to get stuck. Many models is a complicated framework, and small problems in your data can lead to errors.

Additionally, sometimes it's really hard to know what your data will look like when you are dealing with thousands of files you wish to train. Here is some good advice to follow in order to ensure you do not come to an impasse when using your own data with the MMSA:

  • Before using the accelerator, always try creating a single model first with your entire dataset. Check the performance of your model. Only use the MMSA if the single model's performance is subpar compared to your expectations or in a situation where obtaining the best accuracy is mission-critical for your project. Sometimes, the trade-off between complexity and performance isn't worth...
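Because small data problems are hard to spot across thousands of files, a quick pre-flight check over your partitions may help before you train. The sketch below is one possible way to do this. The specific checks, the row threshold, and the column names are assumptions chosen for illustration, not rules taken from the MMSA.

```python
MIN_ROWS = 3  # hypothetical threshold; pick one that suits your data
REQUIRED_COLUMNS = ("WeekStarting", "Quantity")  # assumed column names

# Hypothetical partitions keyed by (store, brand); the values are invented.
PARTITIONS = {
    ("1000", "dominicks"): [
        {"WeekStarting": "1990-06-14", "Quantity": "12003"},
        {"WeekStarting": "1990-06-21", "Quantity": "10239"},
        {"WeekStarting": "1990-06-28", "Quantity": "9880"},
    ],
    ("1001", "tropicana"): [
        {"WeekStarting": "1990-06-14", "Quantity": ""},
    ],
}


def check_partition(rows, required_columns, min_rows):
    """Return a list of problems found in one partition's rows."""
    problems = []
    if len(rows) < min_rows:
        problems.append(f"only {len(rows)} rows (need {min_rows})")
    for col in required_columns:
        if any(col not in row for row in rows):
            problems.append(f"missing column {col!r}")
        elif any(row[col] in ("", None) for row in rows):
            problems.append(f"blank values in {col!r}")
    return problems


report = {
    key: check_partition(rows, REQUIRED_COLUMNS, MIN_ROWS)
    for key, rows in PARTITIONS.items()
}
for key, problems in report.items():
    print(key, "OK" if not problems else problems)
```

Running a report like this over every file gives you a short list of partitions to fix or exclude before you spend compute on training.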

Summary

Advanced solutions like the MMSA are at the bleeding edge of ML and AI. The MMSA is a state-of-the-art technology, and now it's another tool in your belt.

You've not only run all three notebooks on the OJ Sales data, but you have also converted the code to take in other datasets and understood how it works. Prepping data, training models, and forecasting the future using the MMSA are all things you have done and could do again. You may already have a use case to which you can apply it, or you may have to wait a few more years until your company is ready, but you are prepared.

Chapter 8, Choosing Real-Time versus Batch Scoring, continues your journey at the forefront of the ML world. Once you build a model in AutoML, the next step is to deploy it, and there are two options: batch versus real-time scoring. You will learn when to use batch scoring, when to use real-time scoring, and the main differences between the two. Mastering these concepts is key to successfully...
