Chapter 8: Deploying a DL Inference Pipeline at Scale

Deploying a deep learning (DL) inference pipeline for production usage is both exciting and challenging. The exciting part is that, finally, the DL model pipeline can be used for prediction on real-world production data, providing real value to business scenarios. The challenging part is that there are many different DL model serving platforms and host environments, and it is not easy to choose the right framework for the right model serving scenario, one that minimizes deployment complexity while providing the best model serving experience in a scalable and cost-effective way. This chapter starts with an overview of different deployment scenarios and host environments, and then provides hands-on guidance on how to deploy to different environments, including local and remote cloud environments, using MLflow deployment tools. By the end of this chapter, you should be able to confidently deploy an MLflow DL...

Technical requirements

The following items are required for this chapter's learning:

Understanding different deployment tools and host environments

The MLOps technology stack includes a variety of deployment tools, each targeting different use cases and host environments for deploying model inference pipelines. In Chapter 7, Multi-Step Deep Learning Inference Pipeline, we learned about the different inference scenarios and requirements and implemented a multi-step DL inference pipeline that can be deployed into a model hosting/serving environment. Now, we will learn how to deploy such a model to a few specific model hosting and serving environments. This is visualized in Figure 8.1 as follows:

Figure 8.1 – Using model deployment tools to deploy a model inference pipeline to a model hosting and serving environment

As can be seen from Figure 8.1, different deployment tools target different model hosting and serving environments. Here, we list three typical scenarios:

  • Batch inference at scale: If we...

Deploying locally for batch and web service inference

For development and testing purposes, we usually need to deploy our model locally to verify it works as expected. Let's see how to do it for two scenarios: batch inference and web service inference.

Batch inference

For batch inference, follow these instructions:

  1. Make sure you have completed Chapter 7, Multi-Step Deep Learning Inference Pipeline. This produces a logged MLflow pyfunc DL inference pipeline model that can be loaded using standard MLflow Python functions. The logged model can be uniquely identified by its run_id and model name as follows:
    logged_model = 'runs:/37b5b4dd7bc04213a35db646520ec404/inference_pipeline_model'

The model can also be identified by the model name and version number using the model registry as follows:

logged_model = 'models:/inference_pipeline_model/6'
  2. Follow the instructions under the Batch inference at-scale using PySpark UDF function section... (a minimal sketch of this pattern follows below).
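As referenced in step 2, the core pattern is to wrap the logged pipeline as a PySpark UDF and apply it to a column of a Spark DataFrame. The following is a minimal sketch, assuming a hypothetical input CSV file and a text column named text (both placeholders, not values from this chapter):

# A minimal sketch of the PySpark UDF batch-inference pattern referenced above;
# the model URI is the registry URI shown earlier, while the input file path and
# the 'text' column name are hypothetical placeholders.
import mlflow.pyfunc
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

logged_model = 'models:/inference_pipeline_model/6'
# Wrap the logged MLflow pyfunc model as a Spark UDF
loaded_model = mlflow.pyfunc.spark_udf(spark, model_uri=logged_model, result_type='string')

# Score a batch of records; 'data/test.csv' and the 'text' column are assumptions
df = spark.read.option('header', 'true').csv('data/test.csv')
scored_df = df.withColumn('predictions', loaded_model('text'))
scored_df.show()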

Deploying using Ray Serve and MLflow deployment plugins

A more generic way to do deployment is to use a framework such as Ray Serve (https://docs.ray.io/en/latest/serve/index.html). Ray Serve has several advantages: it is agnostic to DL model frameworks, offers native Python support, and supports complex model composition inference patterns. Ray Serve works with all major DL frameworks and arbitrary business logic. So, can we leverage both Ray Serve and MLflow for model deployment and serving? The good news is that we can, using the MLflow deployment plugin provided by Ray Serve. Let's walk through how to use the mlflow-ray-serve plugin (https://github.com/ray-project/mlflow-ray-serve) to do MLflow model deployment with Ray Serve. Before we begin, we need to install the mlflow-ray-serve package:

pip install mlflow-ray-serve

Then, we need to start a single-node Ray cluster locally using the following two commands:

ray start --head
serve start

This will...
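Once the local Ray cluster and the Serve instance are running, the plugin exposes Ray Serve as an MLflow deployment target. The following is a hedged sketch using MLflow's deployments client; the ray-serve target name and the num_replicas option follow the plugin's documentation, and the deployment name, replica count, and input schema are assumptions rather than values from this chapter:

# A minimal sketch of creating a Ray Serve deployment through the MLflow
# deployments client; the 'ray-serve' target comes from the mlflow-ray-serve
# plugin, and the deployment name, replica count, and input column are assumptions.
from mlflow.deployments import get_deploy_client
import pandas as pd

client = get_deploy_client('ray-serve')
client.create_deployment(
    name='dl-inference-pipeline',                      # hypothetical deployment name
    model_uri='models:/inference_pipeline_model/6',
    config={'num_replicas': 1},
)

# Query the deployment with a pandas DataFrame (the 'text' input column is an assumption)
print(client.predict('dl-inference-pipeline', pd.DataFrame({'text': ['what a great movie']})))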

Deploying to AWS SageMaker – a complete end-to-end guide

AWS SageMaker is a cloud-hosted model serving platform managed by AWS. We will use it as an example to show how to deploy to a remote cloud provider for hosted web services that can serve real production traffic. SageMaker offers a suite of ML/DL-related services, including support for data annotation, model training, and much more. Here, we show how to bring your own model (BYOM) for deployment; that is, you have a model inference pipeline trained outside of AWS SageMaker and now just need to deploy it to SageMaker for hosting. Follow the next steps to prepare and deploy a DL sentiment model. A few prerequisites are required:

  • You must have Docker Desktop running in your local environment.
  • You must have an AWS account. You can create a free AWS account easily through the free signup website at https://aws.amazon.com/free/.

Once you have met these requirements, activate the dl-model-chapter08 conda...
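Once the prerequisites are in place, the deployment itself can be driven from Python. The following is a minimal sketch, assuming MLflow 1.x's SageMaker API and an MLflow pyfunc container image already pushed to Amazon ECR; the endpoint name, IAM role ARN, image URL, and region below are placeholders, not values from this chapter:

# A hedged sketch of deploying the inference pipeline to a SageMaker endpoint with
# mlflow.sagemaker (MLflow 1.x style API); every AWS-specific value below is a placeholder.
import mlflow.sagemaker

mlflow.sagemaker.deploy(
    app_name='dl-sentiment-inference',                                   # hypothetical endpoint name
    model_uri='models:/inference_pipeline_model/6',
    execution_role_arn='arn:aws:iam::123456789012:role/SageMakerRole',   # placeholder IAM role
    image_url='123456789012.dkr.ecr.us-west-2.amazonaws.com/mlflow-pyfunc:latest',  # placeholder ECR image
    region_name='us-west-2',                                             # placeholder region
    mode='create',
)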

Summary

In this chapter, we learned different ways to deploy an MLflow inference pipeline model for both batch inference and online real-time inference. We started with a brief survey of different model serving scenarios (batch, streaming, and on-device) and looked at three different categories of tools for MLflow model deployment (the MLflow built-in deployment tool, MLflow deployment plugins, and generic model inference serving frameworks that can work with MLflow inference models). Then, we covered several local deployment scenarios, using a PySpark UDF for batch inference and MLflow's local deployment for web service inference. Afterward, we learned how to use Ray Serve in conjunction with the mlflow-ray-serve plugin to deploy an MLflow Python inference pipeline model into a local Ray cluster. This opens the door to deploying to any cloud platform, such as AWS, Azure ML, or GCP, as long as we can set up a Ray cluster in the cloud. Finally, we provided a complete end-to-end...

Further reading
