Deploying ML Models as a Service

In the previous chapter, you built a model using RHODS. In this chapter, you will start packaging and deploying your models as a service. You will see that you do not need any application development experience to expose your model. This capability enables your data science teams to be more agile in testing new models and making them available for consumption.

In this chapter, we will cover the following topics:

  • Packaging and deploying models as a service
  • Autoscaling the deployed models
  • Releasing new versions of the model
  • Securing the deployed model endpoint

Before we start, please make sure that you have completed the model-building steps and performed the configuration mentioned in the previous chapter. We’ll start by exposing our model as an HTTP service.

Packaging and deploying models as a service

To take advantage of the scalability of OpenShift workloads, the best way to run inferences against an ML model is to deploy the model as an HTTP service. This way, inference calls can be performed by invoking the HTTP endpoint of a model server Pod that is running the model. You can then create multiple replicas of the model server, allowing you to horizontally scale your model to serve more requests.
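To make this concrete, here is a minimal sketch of what such an inference call can look like, assuming the model is exposed over a REST endpoint that implements the KServe V2 inference protocol (supported by the model servers in RHODS). The URL, input tensor name, and feature values below are illustrative placeholders, not values from this book's exercises:

import requests

# Hypothetical inference endpoint; the actual URL comes from the route
# created for your model deployment in RHODS.
ENDPOINT = "https://wine-quality-myproject.apps.example.com/v2/models/wine-quality/infer"

# One wine sample with 11 physicochemical features, matching the
# model's expected input shape.
payload = {
    "inputs": [
        {
            "name": "inputs",   # input tensor name expected by the model
            "shape": [1, 11],
            "datatype": "FP32",
            "data": [7.4, 0.7, 0.0, 1.9, 0.076,
                     11.0, 34.0, 0.9978, 3.51, 0.56, 9.4],
        }
    ]
}

response = requests.post(ENDPOINT, json=payload, timeout=10)
response.raise_for_status()
print(response.json()["outputs"])  # the model's predictions

Because the call is plain HTTP, any client in any language can consume the model, and OpenShift can load-balance these requests across however many replicas are running.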

Recall that you built the wine quality prediction model in the previous chapter. The first stage of exposing the model is to save your model in an S3 bucket. RHODS provides multiple model servers that host your models and allow them to be accessed over HTTP. Think of it as an application server such as JBoss or WebLogic, which takes your Java code and enables it to be executed and accessed over standard protocols.
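Saving the model to S3 typically happens from a notebook cell using the boto3 client. The following is a minimal sketch, assuming the model has been serialized to ONNX, the bucket is named models, and the S3 credentials are injected as environment variables (as a RHODS data connection does); all names are illustrative:

import os
import boto3

# Connection details usually come from the data connection (secret)
# configured in your RHODS project; the names here are assumptions.
s3 = boto3.client(
    "s3",
    endpoint_url=os.environ["AWS_S3_ENDPOINT"],
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)

# Upload the serialized model so a model server can pick it up.
s3.upload_file("wine-quality.onnx", "models", "wine-quality/model.onnx")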

The model servers can serve different model formats; for example, the Intel OpenVINO model server uses the Open Neural Network Exchange...

Autoscaling the deployed models

When creating a model server, you will be presented with the option to set the number of replicas. This corresponds to the number of model server instances to be created and allows you to increase or decrease the serving capacity of your model servers. Figure 5.12 shows this option as Model server replicas:

Figure 5.12 – Add model server

However, with this approach, you need to decide on the number of serving instances, or replicas, at the time of the model server's creation. OpenShift provides another construct that adds an automatic scaler, which increases or decreases the number of model server replicas based on the memory or CPU utilization of the running instances. This construct is called horizontal Pod autoscaling, and it allows workloads to scale automatically to match demand.
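If you prefer to create the autoscaler programmatically rather than through the console, the following sketch uses the Kubernetes Python client and the autoscaling/v1 API. The Deployment name, HPA name, and namespace are placeholders; substitute the actual names from your model server deployment:

from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a Pod

# Target the Deployment backing the model server; the names below are
# hypothetical (check them with: oc get deployments -n <project>).
hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="wine-model-server-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="wine-model-server"
        ),
        min_replicas=1,
        max_replicas=5,
        target_cpu_utilization_percentage=70,  # scale out above 70% average CPU
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="wines", body=hpa
)

The same result can be achieved declaratively with a HorizontalPodAutoscaler manifest or with the oc autoscale command.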

Let’s see how the model server that we defined with the data science project is deployed...

Releasing new versions of the model

Having a model served as a service is not the end of the story. For the model to stay relevant and continue to deliver value to the business, you will need to keep it updated. You will continually release new versions of the model to keep up with the changing environment and to address model drift. Additionally, the release of a new model version may fail, or the new model may not perform as expected. In such cases, you may want to redeploy a newer version or roll back to the previous version of the model to avoid service disruptions. This is why it is important not to overwrite existing models, and why they should be versioned.
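The essence of any such versioning scheme is simple: each pipeline run writes the model artifact to a new, version-specific key in the S3 bucket instead of overwriting the old one, so earlier versions remain available for rollback. The following sketch illustrates the idea; the key layout and variable names are assumptions for illustration, not the exact contents of the notebooks used next:

import os
import boto3

# Illustrative scheme: the version is passed in by the pipeline run.
MODEL_VERSION = os.environ.get("MODEL_VERSION", "v2")

s3 = boto3.client(
    "s3",
    endpoint_url=os.environ["AWS_S3_ENDPOINT"],
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)

# Produces keys such as wine-quality/v1/model.onnx, wine-quality/v2/model.onnx, ...
s3.upload_file(
    "wine-quality.onnx",
    "models",
    f"wine-quality/{MODEL_VERSION}/model.onnx",
)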

To version the model, we’ll create a new pipeline:

  1. In the wines workbench, open a new pipeline editor by going to File | New | Data Science Pipeline Editor.
  2. Drag and drop the wine-training-model.ipynb and the upload-model-versioned.ipynb notebook files into the workspace. This will create...

Securing model endpoints

When exposing models as APIs, you will want to limit access to your APIs to certain clients. You will also want to ensure that the APIs are not vulnerable to known Common Vulnerabilities and Exposures (CVEs). When you store your model containers in Red Hat Quay, it scans the container images for known CVEs in the libraries and the runtime of your code. Quay is outside the scope of this book, but plenty of information about it is available; Packt's OpenShift Multi-Cluster Management Handbook covers Quay in detail, if you want to know more.

The API you deployed earlier in this chapter can be accessed via the HTTPS protocol. This means that OpenShift is already encrypting the traffic using the certificates that have been configured to expose the applications. The configuration of these certificates is outside the scope of this book.

The first step is to restrict access to the API through an authentication mechanism. RHODS...
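Whichever authentication mechanism you configure, the client-side change is typically the same: every request must carry a credential. The following sketch assumes a bearer token made available to the client, for example through an environment variable; the endpoint and variable names are illustrative:

import os
import requests

# Same hypothetical endpoint as before, now protected by token authentication.
ENDPOINT = "https://wine-quality-myproject.apps.example.com/v2/models/wine-quality/infer"

# The token is sent as a bearer token in the Authorization header over HTTPS.
headers = {"Authorization": f"Bearer {os.environ['MODEL_API_TOKEN']}"}

payload = {
    "inputs": [
        {
            "name": "inputs",
            "shape": [1, 11],
            "datatype": "FP32",
            "data": [7.4, 0.7, 0.0, 1.9, 0.076,
                     11.0, 34.0, 0.9978, 3.51, 0.56, 9.4],
        }
    ]
}

response = requests.post(ENDPOINT, json=payload, headers=headers, timeout=10)
response.raise_for_status()

A request without a valid token is rejected before it ever reaches the model server.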

Summary

In this chapter, you worked through the essential tasks surrounding MLOps. You built a complete automated pipeline that trains a model, publishes it to the model store, and deploys it to a model-serving infrastructure, all with RHODS. You also created a pipeline that can roll back model deployments. Finally, you implemented a canary deployment setup for your model deployments. These are the essential skills an MLOps engineer needs.

One thing to note is that RHODS is evolving fast. New versions are released frequently, and by the time you read this book, the screens may look slightly different and some of the ways of configuring the platform may have changed. We suggest using OpenShift version 4.13 when performing the exercises in this book.

In the next chapter, we will take you through the operational tasks of MLOps. These are the activities that you must perform after deploying a model to production. They include monitoring, logging...
