
You're reading from  MLOps with Red Hat OpenShift

Product type: Book
Published in: Jan 2024
Publisher: Packt
ISBN-13: 9781805120230
Edition: 1st
Authors (2):
Ross Brigoli

Ross Brigoli is a consulting architect at Red Hat, where he focuses on designing and delivering solutions around microservices architecture, DevOps, and MLOps with Red Hat OpenShift for various industries. He has two decades of experience in software development and architecture.

Faisal Masood

Faisal Masood is a cloud transformation architect at AWS, where he helps customers refine and execute strategic business goals. His main interests are evolutionary architectures, software development, the ML life cycle, CD, and IaC. Faisal has over two decades of experience in software architecture and development.


Building Machine Learning Models with OpenShift

In the previous chapter, you installed and configured OpenShift to power your machine learning (ML) project life cycle. In this chapter, you will configure the platform components required for model development. You will learn what the OpenShift platform offers for building ML models and how your team can leverage it. Please ensure that you have completed the setup from the previous chapter before starting this one.

This is the first stage of the ML development life cycle, which we presented in Chapter 2. In this chapter, you will see how easy it is for you and your team to start building with the technology provided by Red Hat OpenShift Data Science (RHODS).

We will cover the following topics:

  • Using Jupyter Notebooks in OpenShift
  • Using ML frameworks in OpenShift
  • Using GPU acceleration for model training
  • Building custom notebooks

Technical requirements

In this chapter, you’ll need to use this book’s GitHub repository. This can be found at https://github.com/PacktPublishing/MLOps-with-Red-Hat-OpenShift. The files that you will need in this chapter are located in the chapter3 directory. You will also write basic Python code to validate the deployments and configurations.

Using Jupyter Notebooks in OpenShift

Jupyter Notebook is the de facto standard environment for data scientists and data engineers to analyze data and build ML models. Since the notebooks provided by the platform run as containers, your team can start quickly and consistently by adopting the platform. The platform provides a rapid way to develop, train, and test ML models and deploy them to production. In RHODS, Jupyter Notebook environments are referred to as workbenches. You will learn about workbenches later in this section, but first, we need to learn how to create these environments.

We’ll start by provisioning S3 object storage for you to access the data required for the model training process. This is part of the platform setup, and data scientists will not have to execute these steps for their day-to-day work.

Provisioning an S3 store

ML loves data. A lot of data! S3-compatible object storage is becoming the de facto standard for storing and retrieving unstructured data at scale and is available from all major cloud vendors. You can leverage Kubernetes-native open source tools such as MinIO to provision an S3-compatible object store on your OpenShift cluster. MinIO is a high-performance, S3-compatible object store that can be deployed on OpenShift, which means you can use it both on-premises and in the cloud.

Red Hat also provides an integrated storage component for the OpenShift platform, named OpenShift Data Foundation, that exposes an S3-compatible API. Any standard S3-compatible object storage product will work with RHODS. For this book, we’ve chosen MinIO for simplicity. So, let’s start by installing MinIO on the OpenShift platform.

From the code repository for this book, go to the chapter3 folder and find the minio-complete.yaml file. The following steps show how...
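Once MinIO is running, RHODS reaches it through a data connection, which stores the S3 credentials in an OpenShift secret as base64-encoded values. The following is a minimal sketch of that encoding step; the credential values and endpoint shown here are placeholders, not values from the book's minio-complete.yaml:

```python
import base64

def to_secret_data(creds: dict) -> dict:
    """Base64-encode values the way OpenShift expects in a Secret's `data` field."""
    return {k: base64.b64encode(v.encode()).decode() for k, v in creds.items()}

# Placeholder credentials for illustration only
creds = {
    "AWS_ACCESS_KEY_ID": "minio",
    "AWS_SECRET_ACCESS_KEY": "minio123",
    "AWS_S3_ENDPOINT": "http://minio-service.svc:9000",
}
print(to_secret_data(creds))
```

In practice, the RHODS dashboard creates this secret for you when you fill in the data connection form; the snippet simply shows what ends up stored in the cluster.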

Using ML frameworks in OpenShift

So far, you have seen how easy it is to spin up environments with your chosen configuration. Red Hat provides a set of pre-built images with popular frameworks to speed up your development workflow. We all know how troublesome it is to maintain multiple runtimes and frameworks, each with its own library dependencies. Say you want to start a new environment with TensorFlow: you just select the right container image, as shown in the following screenshot. The View package information option shows you which versions and libraries are available in the container image. The list of available container images is always growing; later, you will learn how to provide custom container images if required:

Figure 3.18 – RHODS – workbench with TensorFlow image

You may have multiple workbenches with different hardware and software. All these environments are listed under your data science project. You can...
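From inside any workbench, you can confirm which framework versions the container image actually provides. A small sketch using only the standard library (the package list here is just a sample, not the full contents of any RHODS image):

```python
from importlib import metadata

def framework_versions(packages=("tensorflow", "torch", "scikit-learn")):
    """Report the installed version of each package, or 'not installed'."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = "not installed"
    return versions

print(framework_versions())
```

Running this in a notebook cell is a quick way to cross-check the View package information listing against what is really inside the running container.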

Using GPU acceleration for model training

In the previous section, you customized software components that your team needs to build models. In this section, you will see how RHODS makes it easy for you to use specific hardware for your workbench.

Imagine that you are working on a simple supervised learning model and do not need any special hardware, such as a GPU, to complete your work. If you work on a laptop, the hardware is fixed at purchase time: you cannot change it dynamically, and it would be expensive for organizations to give every data scientist specialized GPU hardware. It’s even worse when a new GPU model is released after you have already bought an older version for your team. RHODS enables you to provision hardware on demand, so if one team member needs a GPU, they can simply select it from the UI and start using it. When their work is done, the GPU is released back to the hardware pool. This dynamic nature not only reduces costs but...

Enabling GPU support

First, you need to provision nodes with a GPU. Like the MinIO setup, this is a one-time activity that will be executed by the platform engineering team. The entire process of enabling GPU support can be automated for your OpenShift clusters. Let’s learn how to provision machines with a GPU in our cluster.

OpenShift enables you to use machine sets to provision nodes – the machines on which your containers run. To enable GPU support for the Jupyter environment that you created earlier, you need to provision nodes with a GPU. Once these nodes have been provisioned, RHODS will automatically detect them and allow you to use the Jupyter environment with GPU support.

For a Red Hat OpenShift Service on AWS (ROSA) cluster, you can use the Red Hat cloud console to provision new machines. OpenShift can scale out machines so that hardware is provisioned as needed. You can also choose to run these machines on spot instances to further reduce your bill. Let’s see how to...
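Once a GPU node is up and your workbench is scheduled on it, you can confirm the GPU is visible from inside the notebook. The following is a best-effort sketch; the checks and their fallback order are illustrative assumptions on my part, not steps from the book:

```python
import shutil
import subprocess

def detect_gpu() -> str:
    """Best-effort GPU detection from inside a notebook container."""
    # Prefer a framework-level check if PyTorch happens to be installed.
    try:
        import torch
        if torch.cuda.is_available():
            return f"{torch.cuda.device_count()} CUDA device(s) visible to PyTorch"
    except ImportError:
        pass
    # Fall back to the NVIDIA driver utility, if the image exposes it.
    if shutil.which("nvidia-smi"):
        out = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
        return out.stdout.strip() or "nvidia-smi present, no devices listed"
    return "no GPU detected"

print(detect_gpu())
```

A "no GPU detected" result usually means the pod was not scheduled on a GPU node or the GPU was not requested for the workbench.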

Building custom notebooks

Though RHODS comes with several built-in notebook images that you can use, you may require a different library version or dependency, or you may want to add your organization’s certificates to the notebook image. In short, there are many reasons why the provided notebook images may require some tuning.

In this section, you will learn how to tune the existing notebook images, import third-party notebook images, and create your custom notebook images.

RHODS allows you to bring notebook images into the platform, either by importing an existing container image from a registry such as Docker Hub, Quay, or any other container registry, or by customizing an existing notebook image. Let’s look at how to create custom notebook images and import them into RHODS.

Creating a custom notebook image

Creating custom notebook images follows a standard container image build process. This involves creating a Dockerfile that describes how the container image is to be built...
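The build-and-push flow can also be scripted. The following sketch assembles the container commands in Python; the image name, the choice of podman, and the helper names are illustrative assumptions, not prescribed by the book:

```python
import subprocess

def build_commands(image: str, dockerfile: str = "Dockerfile", context: str = "."):
    """Return the container commands that build and push a custom notebook image."""
    return [
        ["podman", "build", "-t", image, "-f", dockerfile, context],
        ["podman", "push", image],
    ]

def build_and_push(image: str, **kwargs) -> None:
    """Execute the build and push steps, failing fast on any error."""
    for cmd in build_commands(image, **kwargs):
        subprocess.run(cmd, check=True)
```

After the image is pushed to your registry, it can be imported into RHODS as a custom notebook image and selected when creating a workbench.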

Summary

In this chapter, you learned how to use the core features of RHODS. You learned how to create and manage data science projects, workbenches, storage, and data connections.

You also saw how RHODS does the heavy lifting for hardware and software provisioning for your model development workflow. This includes learning how to take advantage of GPUs through machine pools. This dynamic model development environment enables your team to be more agile and focus on model building instead of managing the libraries.

Finally, you learned how to extend the base images to create environments better suited to your needs, and how to create and use custom notebook images in RHODS. This allows you to further tailor the experience of your data science team.

In the next chapter, you will learn how to build and package ML models for consumption.

