You're reading from Serverless Machine Learning with Amazon Redshift ML

Product type: Book
Published in: Aug 2023
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781804619285
Edition: 1st Edition
Authors (4):
Debu Panda

Debu Panda, a Senior Manager, Product Management at AWS, is an industry leader in analytics, application platform, and database technologies, and has more than 25 years of experience in the IT world. Debu has published numerous articles on analytics, enterprise Java, and databases and has presented at multiple conferences such as re:Invent, Oracle Open World, and Java One. He is the lead author of EJB 3 in Action (Manning Publications 2007, 2014) and Middleware Management (Packt, 2009).

Phil Bates

Phil Bates is a Senior Analytics Specialist Solutions Architect at AWS. He has more than 25 years of experience implementing large-scale data warehouse solutions. He is passionate about helping customers through their cloud journey and leveraging the power of ML within their data warehouse.

Bhanu Pittampally

Bhanu Pittampally is an Analytics Specialist Solutions Architect at Amazon Web Services. His background is in data and analytics, and he has been in the field for over 16 years. He currently lives in Frisco, TX with his wife Kavitha and daughters Vibha and Medha.

Sumeet Joshi

Sumeet Joshi is an Analytics Specialist Solutions Architect based out of New York. He specializes in building large-scale data warehousing solutions. He has over 17 years of experience in the data warehousing and analytical space.

Bringing Your Own Models for Database Inference

In this book, we’ve covered the process of training models natively using Redshift Machine Learning (Redshift ML). However, there may be instances where you need to utilize models built outside of Redshift. To address this, Redshift ML offers the Bring Your Own Model (BYOM) feature, allowing users to integrate their Amazon SageMaker machine learning models with Amazon Redshift. This feature facilitates making predictions and performing other machine learning tasks on data stored in the warehouse, without requiring data movement.

BYOM offers two approaches: local inference and remote inference. In this chapter, we’ll delve into the workings of BYOM and explore the various options available for creating and integrating BYOM. You’ll be guided through the process of building a machine learning model in Amazon SageMaker, and subsequently, employing Redshift ML’s BYOM feature to bring that model to Redshift. Moreover...

Technical requirements

This chapter requires a web browser and the following:

  • An AWS account
  • An Amazon Redshift Serverless endpoint
  • An Amazon SageMaker notebook
  • Amazon Redshift Query Editor v2
  • Completion of the Getting started with Amazon Redshift Serverless section in Chapter 1

You can find the code used in this chapter here:

https://github.com/PacktPublishing/Serverless-Machine-Learning-with-Amazon-Redshift

The data files required for this chapter are located in a public S3 bucket: s3://packt-serverless-ml-redshift/.

Let’s begin!

Benefits of BYOM

With Amazon Redshift ML, you can take an existing ML model built in Amazon SageMaker and use it in Redshift without having to retrain it. To use BYOM, you need to provide either model artifacts or a SageMaker endpoint, which takes a batch of data and returns predictions. BYOM is useful when a machine learning model type is not yet available in Redshift ML. For example, at the time of writing this book, a Random Cut Forest model is not yet available in Redshift ML, so you can build this model in SageMaker, bring it to Redshift, and then use it against the data stored in Redshift.

Here are some specific benefits of using Redshift ML with your own ML model:

  • Improved efficiency: By using an existing ML model, you can save time and resources that would otherwise be spent on training a new model
  • Easy integration: Redshift ML makes it easy to integrate your ML model into your data pipeline, allowing you to use it for real-time predictions or batch predictions...

Creating the BYOM local inference model

With BYOM local inference, the machine learning model and its dependencies are packaged into a group of files and deployed to Amazon Redshift, where the data is stored, allowing users to make predictions on the stored data. The model artifacts and their dependencies are produced when a model is trained on the Amazon SageMaker platform. Because the model is deployed directly onto the Redshift service, the data is not moved over the network to another service. Local inference can be useful for scenarios where the data is sensitive or where low-latency predictions are required.
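As a rough illustration of what deploying the artifacts to Redshift can look like, the following Python sketch assembles a CREATE MODEL statement for local inference. It assumes the BYOM form in which the FROM clause names either a SageMaker training job or the S3 path of the model artifacts. The helper build_local_byom_sql and every name passed to it (model, function, training job, and bucket) are hypothetical placeholders and are not taken from the book's example.

```python
from typing import Optional, Sequence


def build_local_byom_sql(
    model_name: str,
    source: str,
    function_name: str,
    arg_types: Sequence[str],
    return_type: str,
    s3_bucket: Optional[str] = None,
    iam_role: str = "default",
) -> str:
    """Assemble a CREATE MODEL statement for BYOM local inference.

    `source` is a SageMaker training job name or the S3 path of the
    model artifacts. All names are supplied by the caller; nothing here
    talks to AWS.
    """
    # IAM_ROLE takes the bare keyword `default` or a quoted role ARN.
    role = "default" if iam_role == "default" else f"'{iam_role}'"
    lines = [
        f"CREATE MODEL {model_name}",
        f"FROM '{source}'",
        f"FUNCTION {function_name} ({', '.join(arg_types)})",
        f"RETURNS {return_type}",
        f"IAM_ROLE {role}",
    ]
    # Optional bucket that Redshift can use for intermediate files.
    if s3_bucket is not None:
        lines.append(f"SETTINGS (S3_BUCKET '{s3_bucket}')")
    return "\n".join(lines) + ";"


# Hypothetical names, for illustration only.
local_sql = build_local_byom_sql(
    model_name="demo_local_model",
    source="demo-training-job",
    function_name="demo_predict",
    arg_types=["int", "float", "float"],
    return_type="float",
    s3_bucket="demo-bucket",
)
print(local_sql)
```

The generated statement would be run in Query Editor v2. The argument types listed in the FUNCTION clause need to match, in order, the inputs the SageMaker model was trained on.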

Let’s start working on creating the BYOM local inference model.

Creating a local inference model

To create the BYOM local inference model, the first step involves training and validating an Amazon SageMaker model. For this purpose, we will train and validate an XGBoost linear regression machine learning model on Amazon SageMaker. Follow the instructions found here...

BYOM using a SageMaker endpoint for remote inference

In this section, we will explore how to create a BYOM remote inference model for an Amazon SageMaker Random Cut Forest model. With remote inference, you bring your own machine learning model, trained on data outside of Redshift, and use it through an endpoint to make predictions on data stored in Redshift. The workflow has three steps: a machine learning model is trained in Amazon SageMaker, an endpoint is created for it, and the endpoint is then accessed from within a Redshift query using SQL functions provided by Amazon Redshift ML.

This method is useful when Redshift ML does not natively support a model type, for example, Random Cut Forest. You can read more about Random Cut Forest here: https://tinyurl.com/348v8nnw.
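To make the remote flow more concrete, here is a minimal Python sketch that assembles the two SQL statements involved: a CREATE MODEL statement that registers a SageMaker endpoint through the SAGEMAKER clause, and a query that calls the resulting function. The helpers build_remote_byom_sql and build_inference_query, along with the endpoint, model, function, table, and column names, are hypothetical placeholders, and the decimal return type is only an assumption about how an anomaly score might be typed.

```python
from typing import Sequence


def build_remote_byom_sql(
    model_name: str,
    function_name: str,
    arg_types: Sequence[str],
    return_type: str,
    endpoint_name: str,
    iam_role: str = "default",
) -> str:
    """Assemble a CREATE MODEL statement for BYOM remote inference."""
    # IAM_ROLE takes the bare keyword `default` or a quoted role ARN.
    role = "default" if iam_role == "default" else f"'{iam_role}'"
    lines = [
        f"CREATE MODEL {model_name}",
        f"FUNCTION {function_name} ({', '.join(arg_types)})",
        f"RETURNS {return_type}",
        # Remote inference points at an endpoint instead of artifacts.
        f"SAGEMAKER '{endpoint_name}'",
        f"IAM_ROLE {role}",
    ]
    return "\n".join(lines) + ";"


def build_inference_query(
    function_name: str, columns: Sequence[str], table: str
) -> str:
    """Assemble a SELECT that passes each row's columns to the function."""
    cols = ", ".join(columns)
    return f"SELECT {cols}, {function_name}({cols}) AS score\nFROM {table};"


# Hypothetical names, for illustration only.
remote_sql = build_remote_byom_sql(
    model_name="demo_rcf_model",
    function_name="demo_rcf_score",
    arg_types=["int"],
    return_type="decimal(10,6)",
    endpoint_name="demo-rcf-endpoint",
)
query = build_inference_query("demo_rcf_score", ["value"], "demo_table")
print(remote_sql)
print(query)
```

Unlike the local inference form, this statement has no FROM clause: the model stays in SageMaker, and each call to the function sends the input rows to the endpoint.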

To demonstrate this feature, you will first need to follow the instructions found in this notebook (https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_amazon_algorithms...

Summary

In this chapter, we discussed the benefits and use cases of Amazon Redshift ML BYOM for local and remote inference. We created two SageMaker models and then imported them into Redshift ML as local inference and remote inference model types. We loaded test datasets into Redshift, ran the prediction functions, and validated both types. This demonstrates how Redshift simplifies inference on new data and empowers the business community to use models created outside of Redshift. This method speeds up the delivery of machine learning models created outside of Redshift to the data warehouse team.

In the next chapter, you are going to learn about Amazon Forecast, which enables you to perform forecasting using Redshift ML.
