Chapter 3: Tracking Models, Parameters, and Metrics

Given that MLflow can support multiple scenarios throughout the life cycle of DL models, it is common to adopt MLflow's capabilities incrementally. Usually, people start with MLflow tracking since it is easy to use and covers many scenarios for reproducibility, provenance tracking, and auditing purposes. In addition, tracking a model's history from cradle to sunset extends beyond the data science experiment management domain: it is also important for model governance in the enterprise, where the business and regulatory risks of using models in production need to be managed. While the precise business value of tracking models in production is still evolving, the need to track a model's entire life cycle is unquestionable and growing. To be able to do this, we will begin this chapter by setting up a full-fledged local MLflow tracking server.

We will then take a deep dive into how we can track a model...

Technical requirements

The following are the requirements you will need in order to follow the instructions in this chapter:

Setting up a full-fledged local MLflow tracking server

In Chapter 2, Getting Started with MLflow for Deep Learning, we gained hands-on experience working with a local filesystem-based MLflow tracking server and inspecting the components of an MLflow experiment. However, the default local filesystem-based MLflow server has a limitation: the model registry functionality is not available. The benefit of having a model registry is that we can register models, version-control them, and prepare them for deployment into production. Thus, the model registry bridges the gap between offline experimentation and online production deployment. To track the complete life cycle of a model, we therefore need a full-fledged MLflow tracking server with the following stores (a configuration sketch follows the list):

  • Backend store: A relational database backend is needed to support MLflow's storage of metadata (metrics, parameters, and many others) about the experiment. This also allows the query...
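As a rough sketch of how such a server can be launched and used (the MySQL credentials, database name, MinIO endpoint, and experiment name below are illustrative placeholders, not the book's exact values), the client side looks like this:

    import os
    import mlflow

    # Hypothetical server launch command (run in a shell); the backend
    # store is MySQL and the artifact root is a MinIO bucket exposed
    # through the S3 API:
    #   mlflow server \
    #     --backend-store-uri mysql+pymysql://user:password@localhost:3306/mlflow \
    #     --default-artifact-root s3://mlflow/ \
    #     --host 0.0.0.0 --port 5000

    # Point MLflow's S3 client at the MinIO object store instead of AWS
    # (placeholder endpoint and credentials).
    os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://localhost:9000"
    os.environ["AWS_ACCESS_KEY_ID"] = "minio"
    os.environ["AWS_SECRET_ACCESS_KEY"] = "minio123"

    # Direct all subsequent logging calls to the full-fledged server.
    mlflow.set_tracking_uri("http://localhost:5000")
    mlflow.set_experiment("dl_model_tracking")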

Tracking model provenance

Provenance tracking for digital artifacts has long been studied in the literature. For example, when a piece of patient diagnosis data is used in the biomedical industry, people usually want to know where it came from, what kind of processing and cleaning has been done to it, who owns it, and other history and lineage information about it. The rise of ML/DL models in industrial and business production scenarios makes provenance tracking a required functionality. Different granularities of provenance tracking are critical for operationalizing and managing not just offline data science experimentation, but also the phases before, during, and after a model is deployed in production. So, what needs to be tracked for provenance?

Understanding the open provenance tracking framework

Let's look at a general provenance tracking framework to understand the big picture of why provenance tracking is a major effort. The following diagram...

Tracking model metrics

The default metric for the text classification model in the PyTorch Lightning Flash package is Accuracy. If we want to change the metric to the F1 score (the harmonic mean of precision and recall), which is a very common metric for measuring a classifier's performance, we need to change the configuration of the classifier model before we start the model training process. Let's learn how to make this change and then use MLflow's non-auto-logging API to log the metrics (a logging sketch follows the steps below):

  1. When defining the classifier variable, instead of using the default metric, we will pass an instance of torchmetrics.F1 via the metrics parameter, as follows:
    classifier_model = TextClassifier(
        backbone="prajjwal1/bert-tiny",
        num_classes=datamodule.num_classes,
        metrics=torchmetrics.F1(datamodule.num_classes))

This uses torchmetrics' built-in F1 module, passing the number of classes in the data we need to classify as a parameter. This...
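To make the manual logging step concrete, here is a minimal sketch of logging an F1 value with MLflow's non-auto-logging API; the run name and the metric key returned by trainer.test are assumptions for illustration, since the exact key depends on how the model names its metrics:

    import mlflow

    with mlflow.start_run(run_name="manual_metric_logging"):
        # Fine-tune and evaluate as usual; trainer and datamodule are
        # assumed to be defined as in the earlier steps of this chapter.
        trainer.finetune(classifier_model, datamodule=datamodule, strategy="freeze")
        test_results = trainer.test(classifier_model, datamodule=datamodule)
        # trainer.test returns a list of metric dictionaries; "test_f1"
        # is a hypothetical key name here.
        f1_value = test_results[0].get("test_f1")
        if f1_value is not None:
            mlflow.log_metric("f1_score", f1_value)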

Tracking model parameters

As we have already seen, there are lots of benefits of using auto-logging in MLflow, but if we want to track additional model parameters, we can either use MLflow to log additional parameters on top of what auto-logging records, or directly use MLflow to log all the parameters we want without using auto-logging at all.

Let's walk through a notebook that does not use MLflow auto-logging. If we want full control over which parameters are logged by MLflow, we can use two APIs: mlflow.log_param and mlflow.log_params. The first logs a single key-value parameter pair, while the second logs an entire dictionary of key-value parameters. So, what kind of parameters might we be interested in tracking? The following list answers this (a short sketch follows the list):

  • Model hyperparameters: Hyperparameters are defined before the learning process begins, which means they control how the learning process learns. These parameters can be tuned and can directly affect how well...
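As a minimal sketch of these two APIs (the run name and the specific hyperparameter names and values are illustrative, not the book's exact choices):

    import mlflow

    # Illustrative hyperparameters to record for the run.
    params = {
        "learning_rate": 2e-5,
        "batch_size": 16,
        "max_epochs": 3,
    }

    with mlflow.start_run(run_name="manual_param_logging"):
        # Log a single key-value pair...
        mlflow.log_param("backbone", "prajjwal1/bert-tiny")
        # ...or an entire dictionary of parameters at once.
        mlflow.log_params(params)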

Summary

In this chapter, we set up a local MLflow development environment with full support for backend storage and artifact storage, using MySQL and the MinIO object store. This will be very useful for us when we develop MLflow-supported DL models in this book. We started by presenting the open provenance tracking framework and posing the model provenance tracking questions that are of interest. We addressed the limitations of auto-logging, registered a trained model, and loaded it back from a logged MLflow model for prediction using the mlflow.pytorch.load_model API. We also experimented with directly using MLflow's log_metrics, log_params, and log_model APIs without auto-logging, which gives us more control and flexibility over how we log additional or customized metrics and parameters. We were able to answer many of the provenance questions by performing model provenance tracking, as well as by providing a couple of the questions that require...

Further reading

To learn more about the topics that were covered in this chapter, take a look at the following resources:
