
You're reading from Serverless Machine Learning with Amazon Redshift ML

Product type: Book
Published in: Aug 2023
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781804619285
Edition: 1st Edition
Authors (4):
Debu Panda

Debu Panda, a Senior Manager, Product Management at AWS, is an industry leader in analytics, application platform, and database technologies, and has more than 25 years of experience in the IT world. Debu has published numerous articles on analytics, enterprise Java, and databases and has presented at multiple conferences such as re:Invent, Oracle OpenWorld, and JavaOne. He is the lead author of EJB 3 in Action (Manning Publications, 2007, 2014) and Middleware Management (Packt, 2009).

Phil Bates

Phil Bates is a Senior Analytics Specialist Solutions Architect at AWS. He has more than 25 years of experience implementing large-scale data warehouse solutions. He is passionate about helping customers through their cloud journey and leveraging the power of ML within their data warehouse.

Bhanu Pittampally

Bhanu Pittampally is an Analytics Specialist Solutions Architect at Amazon Web Services. He has a background in data and analytics and has worked in the field for over 16 years. He currently lives in Frisco, TX with his wife Kavitha and daughters Vibha and Medha.

Sumeet Joshi

Sumeet Joshi is an Analytics Specialist Solutions Architect based out of New York. He specializes in building large-scale data warehousing solutions. He has over 17 years of experience in the data warehousing and analytical space.


Time-Series Forecasting in Your Data Warehouse

In previous chapters, we discussed how you can use Amazon Redshift Machine Learning (ML) to easily create, train, and apply ML models using familiar SQL commands. We talked about how we can use supervised learning algorithms for classification or regression problems to predict a certain outcome. In this chapter, we will talk about how you can use your data in Amazon Redshift to forecast a certain future event using Amazon Forecast.

This chapter introduces time-series forecasting on Amazon Redshift with Amazon Forecast (https://aws.amazon.com/forecast/), a fully managed time-series forecasting service. You work entirely in SQL, without moving your data or learning new skills. We will guide you through the following topics:

  • Forecasting and time-series data
  • What is Amazon Forecast?
  • Configuration and security
  • Creating forecasting models using Redshift ML

Technical requirements

This chapter requires a web browser and access to the following:

  • An AWS account
  • Amazon Redshift
  • Amazon Redshift query editor v2

You can find the code used in this chapter here: https://github.com/PacktPublishing/Serverless-Machine-Learning-with-Amazon-Redshift/blob/main/CodeFiles/chapter12/chapter-12.sql.

Forecasting and time-series data

Forecasting is a way of estimating future events by analyzing historical data and past patterns to derive a likely future outcome. For example, based on historical data, a business can predict its sales revenue or estimate what will happen in the next time period.

Forecasting plays a valuable role in guiding businesses to make informed decisions about their operations and priorities. Many organizations rely on data warehouses such as Amazon Redshift to perform deep analytics on vast amounts of historical and current data, enabling them to drive their business goals and gauge future success. Acting as a planning tool, forecasting helps enterprises prepare for future uncertainties by leveraging past patterns, on the underlying principle that what happened in the past will likely recur in the future. These predictions come from analyzing observations recorded over time within a given timeframe.
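
To make the principle that past patterns tend to recur more concrete, here is a minimal sketch of a seasonal naive forecast in Python. It simply repeats the most recent full season forward. This is only an illustration of the idea and is not the algorithm used by Amazon Forecast; the function name and the sample sales figures are hypothetical.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast by repeating the most recent full season of observations."""
    if season_length <= 0 or len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    # Step i of the forecast reuses the value from the same position
    # in the last observed season.
    return [last_season[i % season_length] for i in range(horizon)]


# Hypothetical daily unit sales covering two weeks (Monday to Sunday).
daily_sales = [120, 135, 150, 160, 210, 260, 240,
               125, 140, 155, 158, 220, 270, 250]

# Forecast the next three days using a weekly season.
forecast = seasonal_naive_forecast(daily_sales, season_length=7, horizon=3)
print(forecast)
```

A baseline like this is useful mainly as a point of comparison: a trained forecasting model should normally do better than simply repeating last week's values.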

Here are some examples of how...

What is Amazon Forecast?

Amazon Forecast, like Amazon Redshift ML, requires no ML experience to use. Time-series forecasts are generated using various ML and statistical algorithms based on historical data. As a user, you simply send data to Amazon Forecast, and it will examine the data, automatically identify what is meaningful, and produce a forecasting model.

With Amazon Redshift ML, you can leverage Amazon Forecast to create and train forecasting models from your time-series data and use these models to generate forecasts. For forecasting, we require a target time-series dataset. In target time-series forecasting, we predict the future value of a variable using its past values. This is often called a univariate time series: a single variable recorded sequentially over equal time increments. Currently, Redshift ML supports target time-series datasets with a custom domain. The dataset in your data warehouse must contain the frequency or interval at which you capture your...
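
Because a target time series is expected to be sequential over equal time increments, it can help to check your data for gaps before training. The following is a minimal Python sketch that assumes a daily frequency and finds missing dates in a series. The function name and sample data are hypothetical, and in practice you would likely run an equivalent check in SQL against your Redshift table.

```python
from datetime import date, timedelta


def find_missing_days(timestamps):
    """Return the dates missing from a daily series, in ascending order."""
    observed = set(timestamps)
    if not observed:
        return []
    current, end = min(observed), max(observed)
    missing = []
    # Walk from the first to the last observed day, one day at a time.
    while current <= end:
        if current not in observed:
            missing.append(current)
        current += timedelta(days=1)
    return missing


# Hypothetical univariate series: one value (units sold) per day.
units_sold = {
    date(2023, 1, 1): 10,
    date(2023, 1, 2): 12,
    date(2023, 1, 4): 9,  # January 3 is missing
}

gaps = find_missing_days(units_sold.keys())
print(gaps)
```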

Configuration and security

As Amazon Forecast is a separate fully managed service, you will need to create or modify the IAM role used by your serverless endpoint or Redshift cluster so that it includes permissions to access Amazon Forecast. Additionally, you should configure a trust relationship for Amazon Forecast (forecast.amazonaws.com) in the IAM role so that the service can assume the role.
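
As an illustration of what such a trust relationship looks like, here is a minimal Python sketch that builds an IAM trust policy document and prints it as JSON. The forecast.amazonaws.com principal comes from the text above. Including redshift.amazonaws.com alongside it is an assumption for a role that Redshift also assumes, and the helper name is hypothetical. Adjust the principals to match your own setup.

```python
import json


def build_trust_policy(service_principals):
    """Build an IAM trust policy that lets the given services assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": sorted(service_principals)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# forecast.amazonaws.com is named in the text; redshift.amazonaws.com is an
# assumption for a role that is also used by Amazon Redshift.
trust_policy = build_trust_policy(
    ["redshift.amazonaws.com", "forecast.amazonaws.com"]
)
print(json.dumps(trust_policy, indent=2))
```

The printed JSON can be pasted into the trust relationship editor for the role in the IAM console.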

You can use the AmazonForecastFullAccess managed policy, which grants full access to Amazon Forecast and all of the supported operations. You can attach this policy to your default role, but in production environments you should follow the principle of least privilege. You may use more restrictive permissions, such as the following:

 {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0...

Creating forecasting models using Redshift ML

Without this integration, forecasting on data in your data warehouse means exporting the dataset to an external system, applying forecasting algorithms there to create output datasets, and then importing the results back into the data warehouse for your presentation layer or further analysis. With Redshift ML's integration with Amazon Forecast, you don't have to perform all these steps. You can now create forecasting models directly on your dataset within your data warehouse.

In Chapter 5, we talked about the basic CREATE MODEL syntax and its constructs. Let’s take a look at the CREATE MODEL syntax for forecasting:

CREATE MODEL forecast_model_name
FROM { table_name | ( select_query ) }
TARGET column_name
IAM_ROLE { default | 'arn:aws:iam::<AWS account-id>:role/<role-name>' }
AUTO ON MODEL_TYPE FORECAST
[ OBJECTIVE optimization_metric ]
SETTINGS (S3_BUCKET 'bucket',
   &...

Summary

In this chapter, we discussed how you can use Redshift ML to generate forecasting models with Amazon Forecast by creating a model with MODEL_TYPE FORECAST. You learned what forecasting is and how time-series data is used to generate models for different quantiles. We also looked at the quantiles themselves and talked briefly about different optimization metrics.

We showed how forecast models can be used to predict future sales quantities for a retail use case and how they can be used to balance the effects of over-forecasting and under-forecasting.
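
One common way to see how a quantile balances over-forecasting against under-forecasting is the quantile (pinball) loss. The Python sketch below is a generic illustration of that loss for a single observation. It is not necessarily the exact metric computation that Amazon Forecast performs, and the function name and numbers are hypothetical.

```python
def quantile_loss(actual, predicted, quantile):
    """Quantile (pinball) loss for one observation, with 0 < quantile < 1."""
    if not 0 < quantile < 1:
        raise ValueError("quantile must be strictly between 0 and 1")
    error = actual - predicted
    # Under-forecasting (error > 0) costs `quantile` per unit of error;
    # over-forecasting (error < 0) costs `1 - quantile` per unit of error.
    return max(quantile * error, (quantile - 1) * error)


actual_units = 100

# At the 0.9 quantile, missing demand is penalized more than excess stock.
under_forecast_cost = quantile_loss(actual_units, 90, quantile=0.9)
over_forecast_cost = quantile_loss(actual_units, 110, quantile=0.9)
print(under_forecast_cost, over_forecast_cost)

# At the 0.5 quantile, both kinds of error are penalized equally.
print(quantile_loss(actual_units, 90, 0.5), quantile_loss(actual_units, 110, 0.5))
```

A retailer that fears stock-outs more than excess inventory would lean toward a higher quantile, while one that fears overstocking would lean toward a lower one.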

In the next chapter, we will look at operational and optimization considerations.
