Practical Deep Learning at Scale with MLflow

Author: Yong Liu
Published: July 2022, Packt
ISBN-13: 9781803241333
Pages: 288
Edition: 1st

Table of Contents (17 chapters)

Preface
Section 1 – Deep Learning Challenges and MLflow Primer
Chapter 1: Deep Learning Life Cycle and MLOps Challenges
Chapter 2: Getting Started with MLflow for Deep Learning
Section 2 – Tracking a Deep Learning Pipeline at Scale
Chapter 3: Tracking Models, Parameters, and Metrics
Chapter 4: Tracking Code and Data Versioning
Section 3 – Running Deep Learning Pipelines at Scale
Chapter 5: Running DL Pipelines in Different Environments
Chapter 6: Running Hyperparameter Tuning at Scale
Section 4 – Deploying a Deep Learning Pipeline at Scale
Chapter 7: Multi-Step Deep Learning Inference Pipeline
Chapter 8: Deploying a DL Inference Pipeline at Scale
Section 5 – Deep Learning Model Explainability at Scale
Chapter 9: Fundamentals of Deep Learning Explainability
Chapter 10: Implementing DL Explainability with MLflow
Other Books You May Enjoy

Preface

From AlexNet, which won the large-scale ImageNet competition in 2012, to the BERT pre-trained language model, which topped many natural language processing (NLP) leaderboards in 2018, the modern deep learning (DL) revolution in the broader artificial intelligence (AI) and machine learning (ML) community continues. Yet the challenges of moving these DL models from offline experimentation to a production environment remain, largely due to the complexity of the process and the lack of a unified open source framework that supports the full DL development life cycle. This book will help you understand the big picture of that life cycle and implement DL pipelines that can scale from a local offline experiment to a distributed environment and online production clouds, with an emphasis on hands-on, project-based learning that supports the end-to-end DL process using the popular open source MLflow framework.

The book starts with an overview of the full DL life cycle and the emerging machine learning operations (MLOps) field, providing a clear picture of the four pillars of DL (data, model, code, and explainability) and the role of MLflow in each. In the first chapter, a basic transfer learning-based NLP sentiment model is built using PyTorch Lightning Flash; it is then further developed, tuned, and deployed to production throughout the rest of the book. From there, the book guides you step by step through the concept of MLflow experiments and their usage patterns, using MLflow as a unified framework to track DL data, code and pipelines, models, parameters, and metrics at scale. We'll run DL pipelines in a distributed execution environment with reproducibility and provenance tracking, and tune DL models through hyperparameter optimization (HPO) with Ray Tune, Optuna, and HyperBand. We'll also build a multi-step DL inference pipeline with preprocessing and postprocessing steps, deploy a DL inference pipeline to production using Ray Serve and AWS SageMaker, and, finally, provide DL Explanation-as-a-Service using SHapley Additive exPlanations (SHAP) and MLflow integration.

By the end of this book, you'll have the foundation and hands-on experience to take a DL pipeline from initial offline experimentation to final deployment and production, all within a reproducible, open source framework. Along the way, you will also learn the unique challenges of DL pipelines and how to overcome them with practical, scalable solutions such as multi-core CPUs, graphics processing units (GPUs), distributed and parallel computing frameworks, and the cloud.

Who this book is for

This book is written for data scientists, ML engineers, and AI practitioners who want to master the full life cycle of DL development, from inception to production, using the open source MLflow framework and related tools such as Ray Tune, SHAP, and Ray Serve. The scalable, reproducible, and provenance-aware implementations presented in this book show you how to build an enterprise-grade DL pipeline successfully, and will support anyone building powerful DL cloud applications.

What this book covers

Chapter 1, Deep Learning Life Cycle and MLOps Challenges, covers the five stages of the full DL life cycle and builds the first DL model of this book using a transfer learning approach for text sentiment classification. It also defines the concept of MLOps along with its three foundation layers and four pillars, and the roles of MLflow in these areas. An overview of the challenges in DL data, model, code, and explainability is also presented. This chapter is designed to bring everyone to the same foundational level and provides clarity and guidelines on the scope of the rest of the book.
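
For a flavor of that first model, here is a minimal sketch of a transfer learning-based sentiment classifier in PyTorch Lightning Flash (version 0.5.0, as listed later in this preface); the CSV paths, column names, and backbone are illustrative placeholders, not the book's exact code:

import flash
from flash.text import TextClassificationData, TextClassifier

# Load a labeled sentiment dataset (placeholder file paths and columns)
datamodule = TextClassificationData.from_csv(
    input_fields="review",
    target_fields="sentiment",
    train_file="data/train.csv",
    val_file="data/val.csv",
    batch_size=16,
)

# Fine-tune a small pre-trained transformer backbone (placeholder choice)
model = TextClassifier(backbone="prajjwal1/bert-tiny", num_classes=2)
trainer = flash.Trainer(max_epochs=1)
trainer.finetune(model, datamodule=datamodule, strategy="freeze")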

Chapter 2, Getting Started with MLflow for Deep Learning, serves as an MLflow primer and a first hands-on learning module in which you quickly set up a local filesystem-based MLflow tracking server (or connect to a remote managed MLflow tracking server in Databricks) and perform a first DL experiment using MLflow autologging. It also explains foundational MLflow concepts through concrete examples, such as experiments, runs, the metadata of and relationship between experiments and runs, code tracking, model logging, and model flavors. In particular, we underline that experiments should be first-class entities that bridge the gap between the offline and online production life cycles of DL models. This chapter builds your foundational knowledge of MLflow.
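
As a taste of the pattern, here is a minimal sketch of experiment setup plus autologging, assuming MLflow 1.20+ with PyTorch installed; the tracking URI and experiment name are placeholders:

import mlflow
import mlflow.pytorch

# Point at a local filesystem-based tracking store (placeholder URI)
mlflow.set_tracking_uri("file:./mlruns")
mlflow.set_experiment("dl_sentiment_classifier")

# Capture params, metrics, and the model automatically during training
mlflow.pytorch.autolog()

with mlflow.start_run(run_name="first_dl_run") as run:
    # trainer.finetune(...) from the previous sketch would go here
    print(f"run_id: {run.info.run_id}")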

Chapter 3, Tracking Models, Parameters, and Metrics, covers the first in-depth learning module on tracking, using a fully fledged local MLflow tracking server. It starts by setting up a local MLflow tracking server that runs in Docker Desktop, with a MySQL backend store and a MinIO artifact store. Before implementing tracking, the chapter introduces an open provenance tracking framework based on the open provenance model vocabulary specification and presents six types of provenance questions that can be answered using MLflow. It then provides hands-on examples of how to use the MLflow model-logging and registry APIs to track model provenance, model metrics, and parameters, with or without autologging. Unlike typical MLflow API tutorials, which only provide guidance on using the APIs, this chapter focuses on how successfully we can use MLflow to answer the provenance questions. By the end of this chapter, we can answer four of the six provenance questions; the remaining two require a multi-step pipeline or a deployment to production and are covered in later chapters.
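
As an indicative sketch of the logging and registry APIs the chapter exercises (the model is a trivial stand-in and the registry name is a placeholder):

import mlflow
import mlflow.pytorch
import torch

model = torch.nn.Linear(4, 2)  # stand-in for a trained DL model

with mlflow.start_run() as run:
    # Explicitly track parameters and metrics alongside the model
    mlflow.log_params({"learning_rate": 2e-5, "epochs": 3})
    mlflow.log_metric("val_accuracy", 0.91)

    # Log the model as an artifact, then register it; registration
    # requires a database-backed tracking server, as set up in this chapter
    mlflow.pytorch.log_model(model, artifact_path="model")
    mlflow.register_model(
        model_uri=f"runs:/{run.info.run_id}/model",
        name="dl_sentiment_classifier",
    )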

Chapter 4, Tracking Code and Data Versioning, covers the second in-depth learning module on MLflow tracking. It analyzes current practices around the use of notebooks and pipelines in ML/DL projects, recommends VS Code notebooks, and shows a concrete DL notebook example that can be run either interactively or non-interactively with MLflow tracking enabled. It also recommends using MLflow's MLproject to implement a multi-step DL pipeline through MLflow's entry points and pipeline chaining; a three-step DL pipeline is created for DL model training and registration. In addition, it shows pipeline-level tracking and individual-step tracking through MLflow's parent-child nested runs. Finally, it shows how to track publicly and privately built Python libraries and how to track data versioning in Delta Lake using MLflow.
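
A minimal sketch of the parent-child nested-run pattern for multi-step pipeline tracking (the step names are illustrative, not the book's exact entry points):

import mlflow

# The parent run tracks the pipeline as a whole; each child run
# tracks one step, nested under the parent in the tracking UI
with mlflow.start_run(run_name="dl_pipeline"):
    for step in ("download_data", "fine_tune_model", "register_model"):
        with mlflow.start_run(run_name=step, nested=True):
            mlflow.log_param("pipeline_step", step)
            # ... the step's actual work would go here ...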

Chapter 5, Running DL Pipelines in Different Environments, covers how to run a DL pipeline in different environments. It starts with the scenarios and requirements for executing DL pipelines in different environments, then shows how to use MLflow's command-line interface (CLI) to submit runs in four scenarios: running locally with local code, running locally with remote code in GitHub, running remotely in the cloud with local code, and running remotely in the cloud with remote code in GitHub. The flexibility and reproducibility that MLflow provides for executing a DL pipeline also supply building blocks for continuous integration/continuous deployment (CI/CD) automation when needed.
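
The four scenarios map naturally onto MLflow's project-running API, the Python mirror of the mlflow run CLI; in this hedged sketch, the GitHub URI and the Databricks cluster spec file are placeholders:

import mlflow

# 1) Run locally with local code
mlflow.run(".", entry_point="main")

# 2) Run locally with remote code in GitHub (placeholder repo URI)
mlflow.run("https://github.com/your-org/your-dl-project", entry_point="main")

# 3) Run remotely in the cloud with local code
mlflow.run(".", backend="databricks", backend_config="cluster_spec.json")

# 4) Run remotely in the cloud with remote code in GitHub
mlflow.run(
    "https://github.com/your-org/your-dl-project",
    backend="databricks",
    backend_config="cluster_spec.json",
)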

Chapter 6, Running Hyperparameter Tuning at Scale, covers using MLflow to support HPO at scale with state-of-the-art HPO frameworks such as Ray Tune. It starts with a review of the types and challenges of DL pipeline hyperparameters. It then compares three HPO frameworks, Ray Tune, Optuna, and HyperOpt, providing a detailed analysis of their pros and cons and the maturity of their integration with MLflow. It then recommends Ray Tune and shows how to use it with MLflow to run HPO for the DL model we have been working on in this book so far, and covers how to switch to other HPO search and scheduler algorithms such as Optuna and HyperBand. This enables us to produce high-performing DL models that meet business requirements in a cost-effective and scalable way.
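
A minimal sketch of wiring Ray Tune trials into MLflow, assuming Ray 1.9's MLflowLoggerCallback integration; the toy objective, search space, and experiment name are illustrative:

from ray import tune
from ray.tune.integration.mlflow import MLflowLoggerCallback

def train_fn(config):
    # Stand-in for a DL fine-tuning step; reports one metric per trial
    accuracy = 1.0 - config["lr"]  # toy objective for illustration
    tune.report(val_accuracy=accuracy)

tune.run(
    train_fn,
    config={"lr": tune.loguniform(1e-5, 1e-1)},
    num_samples=4,
    callbacks=[MLflowLoggerCallback(experiment_name="dl_hpo")],
)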

Chapter 7, Multi-Step Deep Learning Inference Pipeline, covers creating a multi-step inference pipeline using MLflow's custom Python model approach. It starts with an overview of four patterns of inference workflows in production, where a single trained model is usually not enough to meet the business application requirements and additional preprocessing and postprocessing steps are needed. It then presents a step-by-step guide to implementing a multi-step inference pipeline that wraps the previously fine-tuned DL sentiment model with language detection, caching, and additional model metadata. This inference pipeline is logged as a generic MLflow PyFunc model that can be loaded using the common MLflow PyFunc load API. Wrapping an inference pipeline as an MLflow model opens the door to automation and consistent management of the model pipeline within the same MLflow framework.
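
A minimal sketch of the custom Python model approach, with the pipeline's real preprocessing, scoring, and postprocessing reduced to placeholder comments:

import mlflow
import mlflow.pyfunc

class InferencePipeline(mlflow.pyfunc.PythonModel):
    """Wraps a fine-tuned model with pre- and postprocessing steps."""

    def load_context(self, context):
        # The book's pipeline loads the fine-tuned sentiment model from
        # context.artifacts here; omitted in this sketch
        pass

    def predict(self, context, model_input):
        # preprocess: detect language, check cache (placeholders)
        # score: call the wrapped DL model (placeholder)
        # postprocess: attach model metadata (placeholder)
        return ["positive"] * len(model_input)

with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="inference_pipeline_model",
        python_model=InferencePipeline(),
    )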

Chapter 8, Deploying a DL Inference Pipeline at Scale, covers deploying a DL inference pipeline into different host environments for production use. It starts with an overview of the landscape of deployment and hosting environments, including batch inference and streaming inference at scale. It then describes the different deployment mechanisms, such as MLflow's built-in model serving tools, custom deployment plugins, and generic model serving frameworks such as Ray Serve. It shows examples of how to deploy a batch inference pipeline using MLflow's Spark user-defined function (UDF), and how to serve a DL inference pipeline as a local web service using either MLflow's built-in model serving tool or Ray Serve's MLflow deployment plugin, mlflow-ray-serve. It then gives a complete step-by-step guide to deploying a DL inference pipeline to a managed AWS SageMaker instance for production use.
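
For the SageMaker path, here is a minimal sketch using MLflow's SageMaker deployment API; the app name, model URI, IAM role ARN, and region are all placeholders that the chapter walks through with real values:

import mlflow.sagemaker

# Push the registered inference pipeline to a managed SageMaker endpoint
mlflow.sagemaker.deploy(
    app_name="dl-inference-pipeline",
    model_uri="models:/inference_pipeline_model/6",
    execution_role_arn="arn:aws:iam::123456789012:role/sagemaker-role",
    region_name="us-west-2",
    mode="create",
)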

Chapter 9, Fundamentals of Deep Learning Explainability, covers the foundational concepts of explainability and explores two popular explainability tools. It starts with an overview of the eight dimensions of explainability and explainable AI (XAI), then provides concrete learning examples that explore the use of the SHAP and Transformers-interpret toolboxes with an NLP sentiment pipeline. It emphasizes that explainability should be treated as a first-class artifact when developing a DL application, since there are increasing demands and expectations for model and data explanation in various business applications and domains.
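
A minimal sketch of SHAP applied to a Hugging Face sentiment pipeline; the checkpoint below is a common public model used here as an assumption, not necessarily the book's:

import shap
from transformers import pipeline

# A small pre-trained sentiment pipeline (illustrative public checkpoint)
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

# SHAP's generic Explainer auto-selects a text masker for pipelines
explainer = shap.Explainer(classifier)
shap_values = explainer(["This book made MLflow finally click for me."])
shap.plots.text(shap_values)  # token-level attribution visualization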

Chapter 10, Implementing DL Explainability with MLflow, covers how to implement DL explainability using MLflow to provide Explanation-as-a-Service (EaaS). It starts with an overview of MLflow's current capability to support explainers and explanations; notably, the existing SHAP integration in the MLflow APIs does not support DL explainability at scale. The chapter therefore provides two generic implementation routes, using MLflow's artifact-logging APIs and its PyFunc APIs. Examples are provided for implementing SHAP explanations, logging SHAP values as a bar chart in an MLflow tracking server's artifact store. A SHAP explainer can also be logged as an MLflow Python model and then loaded either as a Spark UDF for batch explanation or as a web service for online EaaS. This provides maximal flexibility within a unified MLflow framework for implementing explainability.
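
A minimal sketch of the artifact-logging route, using toy explanation values as a stand-in for real SHAP output; the feature names and artifact filename are illustrative:

import matplotlib.pyplot as plt
import mlflow
import numpy as np
import shap

# Toy explanation standing in for a real SHAP computation
explanation = shap.Explanation(
    values=np.array([0.3, -0.2, 0.1]),
    base_values=0.0,
    feature_names=["token_a", "token_b", "token_c"],
)

with mlflow.start_run():
    shap.plots.bar(explanation, show=False)
    # Log the bar chart into the tracking server's artifact store
    mlflow.log_figure(plt.gcf(), "shap_bar_chart.png")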

To get the most out of this book

The majority of the code in this book can be implemented and executed using the open source MLflow tool, with a few exceptions that require a 14-day full Databricks trial (sign up at https://databricks.com/try-databricks) and an AWS Free Tier account (sign up at https://aws.amazon.com/free/). The following are some of the major software packages covered in this book:

  • MLflow 1.20.2 and above
  • Python 3.8.10
  • Lightning-flash 0.5.0
  • Transformers 4.9.2
  • SHAP 0.40.0
  • PySpark 3.2.1
  • Ray[tune] 1.9.2
  • Optuna 2.10.0

The complete package dependencies are listed in each chapter's requirements.txt file or conda.yaml file in this book's GitHub repository. All code has been tested to run successfully in a macOS or Linux environment. If you are a Microsoft Windows user, it is recommended that you install WSL2 to run the bash scripts provided in this book: https://www.windowscentral.com/how-install-wsl2-windows-10. It is a known issue that the MLflow CLI does not work properly in the Microsoft Windows command line.

Starting from Chapter 3, Tracking Models, Parameters, and Metrics of this book, you will also need to have Docker Desktop (https://www.docker.com/products/docker-desktop/) installed to set up a fully-fledged local MLflow tracking server for executing the code in this book. AWS SageMaker is needed in Chapter 8, Deploying a DL Inference Pipeline at Scale, for the cloud deployment example. VS Code version 1.60 or above (https://code.visualstudio.com/updates/v1_60) is used as the integrated development environment (IDE) in this book. Miniconda version 4.10.3 or above (https://docs.conda.io/en/latest/miniconda.html) is used throughout this book for creating and activating virtual environments.

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book's GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Finally, to get the most out of this book, you should have experience in programming in Python and have a basic understanding of popular ML and data manipulation libraries such as pandas and PySpark.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Practical-Deep-Learning-at-Scale-with-MLFlow. If there's an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781803241333_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "For learning purposes, we have provided two example mlruns artifacts and the huggingface cache folder in the GitHub repository under the chapter08 folder."

A block of code is set as follows:

client = boto3.client('sagemaker-runtime')
response = client.invoke_endpoint(
    EndpointName=app_name,
    ContentType=content_type,
    Accept=accept,
    Body=payload,
)

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

loaded_model = mlflow.pyfunc.spark_udf(
    spark,
    model_uri=logged_model, 
    result_type=StringType())

Any command-line input or output is written as follows:

mlflow models serve -m models:/inference_pipeline_model/6

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: "To execute the code in this cell, you can just click on Run Cell in the top-right drop-down menu."

Tips or Important Notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you've read Practical Deep Learning at Scale with MLflow, we'd love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we're delivering excellent quality content.
