You're reading from Automated Machine Learning with Microsoft Azure

Product type: Book
Published in: Apr 2021
Publisher: Packt
ISBN-13: 9781800565319
Edition: 1st Edition
Author: Dennis Michael Sawyers

Dennis Michael Sawyers is a senior cloud solutions architect (CSA) at Microsoft, specializing in data and AI. In his role as a CSA, he helps Fortune 500 companies leverage Microsoft Azure cloud technology to build top-class machine learning and AI solutions. Prior to his role at Microsoft, he was a data scientist at Ford Motor Company in Global Data Insight and Analytics (GDIA) and a researcher in anomaly detection at the highly regarded Carnegie Mellon Auton Lab. He received a master's degree in data analytics from Carnegie Mellon's Heinz College and a bachelor's degree from the University of Michigan. More than anything, Dennis is passionate about democratizing AI solutions through automated machine learning technology.

Chapter 6: Building an AutoML Forecasting Solution

Having built AutoML regression and classification solutions, you are now ready to tackle a more complicated problem: forecasting. Forecasting is inherently more complex than either classification or regression. Those two machine learning (ML) problem types assume that time is irrelevant: the relationship between inputs and outputs is treated as fixed. Regardless of how much time passes, your diabetes model is expected to predict whose condition worsens, and your Titanic model is expected to predict who lives and who dies. In contrast, with forecasting problems, you are always trying to predict future events based on past events; time will always be a factor in your model.

You will begin this chapter similarly to how you began Chapter 4, Building an AutoML Regression Solution, and Chapter 5, Building an AutoML Classification Solution. First, you will navigate to your Jupyter environment, load in data, train a model, and evaluate the results....

Technical requirements

As in Chapter 4, Building an AutoML Regression Solution, you will be creating and training models with Python code in a Jupyter notebook running on an Azure compute instance. As such, you will require a working internet connection, an Azure Machine Learning Service (AMLS) workspace, and a compute instance. You will also need a working compute cluster to train models remotely while you continue to work on your notebook. The full list of requirements is as follows:

  • Access to the internet.
  • A web browser, preferably Google Chrome or Microsoft Edge Chromium.
  • A Microsoft Azure account.
  • You should have created an AMLS workspace.
  • You should have created a compute instance.
  • You should have created the compute cluster in Chapter 2, Getting Started with Azure Machine Learning Service.
  • You should understand how to navigate to the Jupyter environment from an Azure compute instance as demonstrated in Chapter 4, Building an AutoML...

Prepping data for AutoML forecasting

Forecasting is very different from either classification or regression. ML models for regression or classification predict some output based on some input data. ML models for forecasting, on the other hand, predict a future state based on patterns found in the past. This means that there are key time-related details you need to pay attention to while shaping your data.
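One of the time-related details mentioned above is how you hold out data for evaluation. The following is a minimal sketch, not the book's code, of a chronological split that keeps the most recent periods for testing so that the model is never evaluated on data older than what it was trained on. The `chronological_split` helper, the `week` and `quantity` field names, and the sample values are all hypothetical.

```python
from datetime import date, timedelta


def chronological_split(rows, time_key, holdout_periods):
    """Sort rows by time and hold out the most recent periods for testing."""
    ordered = sorted(rows, key=lambda r: r[time_key])
    # Distinct time stamps, oldest first
    periods = sorted({r[time_key] for r in ordered})
    if not 0 < holdout_periods < len(periods):
        raise ValueError("holdout_periods must leave at least one training period")
    cutoff = periods[-holdout_periods]  # first period that belongs to the test set
    train = [r for r in ordered if r[time_key] < cutoff]
    test = [r for r in ordered if r[time_key] >= cutoff]
    return train, test


# Hypothetical weekly sales for one store and brand (values are made up).
start = date(2021, 1, 7)
rows = [{"week": start + timedelta(weeks=i), "quantity": 100 + 5 * i} for i in range(10)]

train, test = chronological_split(rows, time_key="week", holdout_periods=3)
print(len(train), len(test))
# Every training week precedes every test week
print(max(r["week"] for r in train) < min(r["week"] for r in test))
```

A random split, which is common for regression and classification, would mix future rows into the training set and make the evaluation look better than it should.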

For this exercise, you are going to use the OJ Sales Simulated Data Azure Open Dataset for forecasting. Similar to the Diabetes Sample Azure Open Dataset you used for regression, OJ Sales Simulated Data is available simply by having an Azure account. You will use this data to create a model to predict future orange juice sales across different brands and stores.

There is one key difference from the Diabetes Sample dataset: OJ Sales Simulated Data is a file dataset instead of a tabular dataset. While tabular datasets consist of one file containing columns and rows, file datasets consist of many files...
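To make the distinction concrete, here is a small sketch, using only the Python standard library, of turning many files into a single table of rows. It does not use the Azure SDK, and it is not the book's loading code; the `combine_csv_files` helper, the file names, the column names, and the values are all hypothetical stand-ins.

```python
import csv
import tempfile
from pathlib import Path


def combine_csv_files(folder):
    """Read every CSV file in a folder and return all rows as one list of dicts."""
    combined = []
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = path.name  # remember which file the row came from
                combined.append(row)
    return combined


# Build a tiny stand-in for a file dataset: one CSV per store and brand.
# File names, column names, and values are hypothetical.
with tempfile.TemporaryDirectory() as folder:
    for store, brand in [(1, "brand_a"), (1, "brand_b"), (2, "brand_a")]:
        file_path = Path(folder) / f"store{store}_{brand}.csv"
        with file_path.open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["week", "store", "brand", "quantity"])
            writer.writerow(["2021-01-07", store, brand, 120])
            writer.writerow(["2021-01-14", store, brand, 135])
    all_rows = combine_csv_files(folder)

print(len(all_rows))
print(sorted({row["source_file"] for row in all_rows}))
```

Note that `csv.DictReader` returns every value as a string, so numeric columns would still need to be converted before training.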

Training an AutoML forecasting model

Training an AutoML forecasting model is most similar to training an AutoML regression model. Like regression and unlike classification, you are trying to predict a number. Unlike regression, this number is always in the future and is based on patterns found in the past. Also unlike regression, you can predict a whole series of numbers into the future. For example, you can choose to predict one month out into the future, or you can choose to predict 6, 12, 18, or even 24 months out.

Important tip

The further out you try to predict, the less accurate your forecasting model will be.
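The tip can be illustrated with a deliberately simple case. The sketch below assumes a naive forecaster that repeats the last observed value and a made-up monthly series with a steady trend. It is not an AutoML model, and real series will behave less neatly, but it shows the error growing as the horizon lengthens.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value for every future step."""
    return [history[-1]] * horizon


# Hypothetical monthly series with a steady upward trend of 3 units per month.
series = [100 + 3 * t for t in range(36)]
history, future = series[:24], series[24:]

forecast = naive_forecast(history, horizon=len(future))
abs_errors = [abs(actual - predicted) for actual, predicted in zip(future, forecast)]

# Absolute error at 1, 6, and 12 months ahead
for months_ahead in (1, 6, 12):
    print(months_ahead, abs_errors[months_ahead - 1])
```

In this toy setting the absolute error is 3 units at 1 month ahead, 18 at 6 months, and 36 at 12 months, because the forecaster ignores the trend entirely.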

Follow the same steps you have seen in Chapter 4, Building an AutoML Regression Solution, and Chapter 5, Building an AutoML Classification Solution. First, begin by setting a name for your experiment. Then, set your target column and your AutoML configurations.

For forecasting, there is an additional step: setting your forecasting parameters. This is where you will set things...

Registering your trained forecasting model

The code to register forecasting models is identical to the code you used to register your regression model in Chapter 4, Building an AutoML Regression Solution, and your classification models in Chapter 5, Building an AutoML Classification Solution. Always register new models, as you will use them in either real-time scoring endpoints or batch execution inference pipelines, depending on your business scenario. Likewise, always add tags and descriptions for easier tracking:

  1. First, give your model a name, a description, and some tags. Tags let you easily search for models, so think carefully as you implement them:
    description = 'Best AutoML Forecasting Run using OJ Sales Sample Data.'
    tags = {'project': 'OJ Sales', 'creator': 'your name'}
    model_name = 'OJ-Sales-Sample-Forecasting-AutoML'
  2. Next, register your model to your AMLS workspace...

Fine-tuning your AutoML forecasting model

In this section, you will first review tips and tricks for improving your AutoML forecasting models and then review the algorithms used by AutoML for forecasting.

Improving AutoML forecasting models

Forecasting is very easy to get wrong. It's easy to produce a model that seems to work in development, but fails to make accurate predictions once deployed to production. Many data scientists, even experienced ones, make mistakes. While AutoML will help you avoid some of the common mistakes, there are others that require you to exercise caution. In order to sidestep these pitfalls and make the best models possible, follow these tips and tricks:

  • Any feature column that you train with has to be available in the future when you make a prediction. With OJ Sales Sample, this means that, if you want to predict the quantity of sales 6 weeks out and include price as an input variable, you need to know the price of each product 6 weeks...
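The feature-availability tip above can be checked before scoring. The following is a rough sketch under stated assumptions: `missing_future_features` is a hypothetical helper, not part of AutoML or the Azure SDK, and the column names and values are invented for illustration. It flags any future row where a feature the model was trained on is not yet known.

```python
def missing_future_features(future_rows, feature_columns):
    """Return (row_index, column) pairs where a required feature is absent or empty."""
    problems = []
    for index, row in enumerate(future_rows):
        for column in feature_columns:
            if row.get(column) in (None, ""):
                problems.append((index, column))
    return problems


# Features the model was trained on (hypothetical column names).
feature_columns = ["price", "advert"]

# Rows to score for the next three weeks; the price for the last week is not yet known.
future_rows = [
    {"week": "2021-03-04", "price": 2.49, "advert": 1},
    {"week": "2021-03-11", "price": 2.59, "advert": 0},
    {"week": "2021-03-18", "price": None, "advert": 0},
]

print(missing_future_features(future_rows, feature_columns))
```

If a check like this reports gaps for the horizon you care about, the usual options are to drop that feature from training or to shorten the horizon to the range where the feature is known.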

Summary

You have now successfully trained all three types of AutoML models – classification, regression, and forecasting. Not only can you train a simple forecasting model, but you also know how to improve models with the various forecasting parameters and how to build high-performing baseline models with ARIMA and Prophet.

Moreover, you have learned how forecasting differs from other problems and how to avoid common pitfalls. By using the forecast horizon feature wisely, you can forecast days, months, or years into the future. Now it's time to add another powerful tool to your repertoire.

In Chapter 7, Using the Many Models Solution Accelerator, you will be able to build individual models for each time series grain. Instead of building one forecasting model, you can build thousands of models all at the same time and score them as if they were one model. You will find that this approach can vastly enhance your model's performance...
