Deploying ML Models as a Service

In the previous chapter, you built a model using RHODS. In this chapter, you will start packaging and deploying your models as a service. You will see that you do not need any application development experience to expose your model. This capability enables your data science teams to be more agile in testing new models and making them available for consumption.

In this chapter, we will cover the following topics:

  • Packaging and deploying models as a service
  • Autoscaling the deployed models
  • Releasing new versions of the model
  • Securing the deployed model endpoint

Before we start, please make sure that you have completed the model-building steps and performed the configuration mentioned in the previous chapter. We’ll start by exposing our model as an HTTP service.

Packaging and deploying models as a service

To take advantage of the scalability of OpenShift workloads, the best way to run inferences against an ML model is to deploy the model as an HTTP service. This way, inference calls can be performed by invoking the HTTP endpoint of a model server Pod that is running the model. You can then create multiple replicas of the model server, allowing you to horizontally scale your model to serve more requests.
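To make this concrete, here is a minimal sketch of what such an inference call can look like, assuming the model is exposed over a REST endpoint that implements the KServe V2 inference protocol (supported by the model servers in RHODS). The URL, input tensor name, and feature values below are illustrative placeholders, not values from this book's exercises:

import requests

# Hypothetical inference endpoint; the actual URL comes from the route
# created for your model deployment in RHODS.
ENDPOINT = "https://wine-quality-myproject.apps.example.com/v2/models/wine-quality/infer"

# One wine sample with 11 physicochemical features, matching the
# model's expected input shape.
payload = {
    "inputs": [
        {
            "name": "inputs",   # input tensor name expected by the model
            "shape": [1, 11],
            "datatype": "FP32",
            "data": [7.4, 0.7, 0.0, 1.9, 0.076,
                     11.0, 34.0, 0.9978, 3.51, 0.56, 9.4],
        }
    ]
}

response = requests.post(ENDPOINT, json=payload, timeout=10)
response.raise_for_status()
print(response.json()["outputs"])  # the model's predictions

Because the call is plain HTTP, any client in any language can consume the model, and OpenShift can load-balance these requests across however many replicas are running.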

Recall that you built the wine quality prediction model in the previous chapter. The first stage of exposing the model is to save your model in an S3 bucket. RHODS provides multiple model servers that host your models and allow them to be accessed over HTTP. Think of it as an application server such as JBoss or WebLogic, which takes your Java code and enables it to be executed and accessed over standard protocols.
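Saving the model to S3 typically happens from a notebook cell using the boto3 client. The following is a minimal sketch, assuming the model has been serialized to ONNX, the bucket is named models, and the S3 credentials are injected as environment variables (as a RHODS data connection does); all names are illustrative:

import os
import boto3

# Connection details usually come from the data connection (secret)
# configured in your RHODS project; the names here are assumptions.
s3 = boto3.client(
    "s3",
    endpoint_url=os.environ["AWS_S3_ENDPOINT"],
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)

# Upload the serialized model so a model server can pick it up.
s3.upload_file("wine-quality.onnx", "models", "wine-quality/model.onnx")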

The model servers can serve different model formats; for example, the Intel OpenVINO model server uses the Open Neural Network Exchange...

Autoscaling the deployed models

When creating a model server, you will be presented with the option to set the number of replicas. This corresponds to the number of model server instances to be created and allows you to increase or decrease the serving capacity of your model servers. Figure 5.12 shows this option as Model server replicas:

Figure 5.12 – Add model server

However, with this approach, you need to decide on the number of serving instances, or replicas, at the time of the model server's creation. OpenShift provides another construct that adds an automatic scaler, which increases or decreases the number of model server replicas based on the memory or CPU utilization of the running instances. This construct is called horizontal Pod autoscaling, and it allows workloads to scale automatically to match demand.
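If you prefer to create the autoscaler programmatically rather than through the console, the following sketch uses the Kubernetes Python client and the autoscaling/v1 API. The Deployment name, HPA name, and namespace are placeholders; substitute the actual names from your model server deployment:

from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a Pod

# Target the Deployment backing the model server; the names below are
# hypothetical (check them with: oc get deployments -n <project>).
hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="wine-model-server-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="wine-model-server"
        ),
        min_replicas=1,
        max_replicas=5,
        target_cpu_utilization_percentage=70,  # scale out above 70% average CPU
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="wines", body=hpa
)

The same result can be achieved declaratively with a HorizontalPodAutoscaler manifest or with the oc autoscale command.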

Let’s see how the model server that we defined with the data science project is deployed...

Releasing new versions of the model

Having a model served as a service is not the end of the story. For the model to stay relevant and continue to deliver value to the business, you will need to keep it updated. You will continually release new versions of the model to keep up with the changing environment and to address model drift. Additionally, the release of a new model version may fail, or the new model may not perform as expected. In such cases, you may want to redeploy a newer version or roll back to the previous version of the model to avoid service disruptions. This is why it is important not to overwrite existing models, and why they should be versioned.
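The essence of any such versioning scheme is simple: each pipeline run writes the model artifact to a new, version-specific key in the S3 bucket instead of overwriting the old one, so earlier versions remain available for rollback. The following sketch illustrates the idea; the key layout and variable names are assumptions for illustration, not the exact contents of the notebooks used next:

import os
import boto3

# Illustrative scheme: the version is passed in by the pipeline run.
MODEL_VERSION = os.environ.get("MODEL_VERSION", "v2")

s3 = boto3.client(
    "s3",
    endpoint_url=os.environ["AWS_S3_ENDPOINT"],
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)

# Produces keys such as wine-quality/v1/model.onnx, wine-quality/v2/model.onnx, ...
s3.upload_file(
    "wine-quality.onnx",
    "models",
    f"wine-quality/{MODEL_VERSION}/model.onnx",
)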

To version the model, we’ll create a new pipeline:

  1. In the wines workbench, open a new pipeline editor by going to File | New | Data Science Pipeline Editor.
  2. Drag and drop the wine-training-model.ipynb and the upload-model-versioned.ipynb notebook files into the workspace. This will create...

Securing model endpoints

When exposing models as APIs, you will want to limit access to your APIs to certain clients. You will also want to ensure that the APIs are not vulnerable to known Common Vulnerabilities and Exposures (CVEs). When you store your model containers in Red Hat Quay, it scans the container images for known CVEs in the libraries and the runtime of your code. Quay is outside the scope of this book, but plenty of information about it is available; Packt's OpenShift Multi-Cluster Management Handbook covers Quay in detail, if you want to know more.

The API you deployed earlier in this chapter can be accessed via the HTTPS protocol. This means that OpenShift is already encrypting the traffic using the certificates that have been configured to expose the applications. The configuration of these certificates is outside the scope of this book.

The first step is to restrict access to the API through an authentication mechanism. RHODS...
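Whichever authentication mechanism you configure, the client-side change is typically the same: every request must carry a credential. The following sketch assumes a bearer token made available to the client, for example through an environment variable; the endpoint and variable names are illustrative:

import os
import requests

# Same hypothetical endpoint as before, now protected by token authentication.
ENDPOINT = "https://wine-quality-myproject.apps.example.com/v2/models/wine-quality/infer"

# The token is sent as a bearer token in the Authorization header over HTTPS.
headers = {"Authorization": f"Bearer {os.environ['MODEL_API_TOKEN']}"}

payload = {
    "inputs": [
        {
            "name": "inputs",
            "shape": [1, 11],
            "datatype": "FP32",
            "data": [7.4, 0.7, 0.0, 1.9, 0.076,
                     11.0, 34.0, 0.9978, 3.51, 0.56, 9.4],
        }
    ]
}

response = requests.post(ENDPOINT, json=payload, headers=headers, timeout=10)
response.raise_for_status()

A request without a valid token is rejected before it ever reaches the model server.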

Summary

In this chapter, you worked through the essential tasks surrounding MLOps. You built a complete automated pipeline that trains a model, publishes it to the model store, and deploys it to a model-serving infrastructure, all with RHODS. You also created a pipeline that can roll back model deployments. Finally, you implemented a canary deployment setup for your model deployments. These are the essential skills an MLOps engineer needs.

One thing to note is that RHODS is evolving fast. New versions are released frequently, and by the time you read this book, the screens may look slightly different and some of the ways of configuring the platform may have changed. We suggest using OpenShift version 4.13 when performing the exercises in this book.

In the next chapter, we will take you through the operational tasks of MLOps. These are the activities that you must perform after deploying a model to production. They include monitoring, logging...
