Chapter 12: Model Serving and Monitoring

In this chapter, we will reflect on the need to serve and monitor machine learning (ML) models in production and explore different means of serving ML models to the users or consumers of a model. Then, we will revisit the Explainable Monitoring framework from Chapter 11, Key Principles for Monitoring Your ML System, and implement it for the business use case we have been solving using MLOps to predict the weather. The implementation of the Explainable Monitoring framework is hands-on: we will perform inference on the deployed API, then monitor and analyze the inference data for drift (such as data drift, feature drift, and model drift) to measure the performance of the ML system. Finally, we will look at several concepts for governing ML systems to ensure robust performance and drive continuous learning and delivery.

Let's start by reflecting on the need to monitor ML in production. Then, we will move on to explore the following topics in this...

Serving, monitoring, and maintaining models in production

There is no point in deploying a model or an ML system and not monitoring it. Monitoring is one of the most important aspects of an ML system: it enables us to analyze and map out, both qualitatively and quantitatively, the business impact an ML system delivers to its stakeholders. To achieve maximum business impact, the users of an ML system need to be served in the most convenient manner possible so that they can consume the ML system and generate value. In previous chapters, we developed and deployed an ML model to predict the weather conditions at a port as part of the business use case that we have been solving for practical implementation. In this chapter, we will revisit the Explainable Monitoring framework that we discussed in Chapter 11, Key Principles for Monitoring Your ML System, and implement it within our business use case. In Figure 12.1, we can see the Explainable Monitoring framework and some of...

Exploring different modes of serving ML models

In this section, we will consider how a model can be served so that users (both humans and machines) can consume the ML service efficiently. Model serving is a critical area in which an ML system needs to succeed in order to deliver its business impact, as any lag or bug here can be costly for the users being served. Robustness, availability, and convenience are key factors to keep in mind while serving ML models. Let's take a look at some of the ways in which ML models can be served: a model can be served in batch mode or in on-demand mode (for instance, when a query is made on demand to get a prediction). In on-demand mode, a model can be served to either a machine or a human user. Here is an example of serving a model to a user:

Figure 12.2 – Serving a model to users

In a typical scenario (in on-demand mode), a model is served as a service for users to consume, as shown in Figure 12.2. Then, an external application...
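
To make the on-demand flow in Figure 12.2 concrete, here is a minimal sketch of serving a model behind a REST endpoint. It is only an illustration, not the deployment built in the earlier chapters: the FastAPI app, the model.pkl file, and the three input features are assumptions made for this example.

```python
# Minimal on-demand serving sketch (illustrative only).
# Assumptions: a scikit-learn-style model is serialized as "model.pkl" and
# expects the three features defined in WeatherFeatures below.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Weather prediction service")

# Load the serialized model once at startup so every request reuses it.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)


class WeatherFeatures(BaseModel):
    temperature: float
    humidity: float
    wind_speed: float


@app.post("/predict")
def predict(features: WeatherFeatures):
    # Arrange the incoming features in the order the model was trained on,
    # score a single row, and return the result as JSON.
    row = [[features.temperature, features.humidity, features.wind_speed]]
    prediction = model.predict(row)[0]
    return {"rain_expected": bool(prediction)}
```

Run it with `uvicorn serve:app` (assuming the file is saved as serve.py) and send a POST request with a JSON body to /predict. A batch service, by contrast, would load a stored dataset, score it in bulk on a schedule, and write the predictions back to storage rather than answering individual queries.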

Implementing the Explainable Monitoring framework

To implement the Explainable Monitoring framework, it is worth doing a recap of what has been discussed so far in terms of implementing our hypothetical use case. Here is a recap of the use case implementation, including the problem and its solution:

  • Problem context: You work as a data scientist in a small team with three other data scientists at a cargo shipping company based at the port of Turku in Finland. 90% of the goods imported into Finland arrive by cargo ship at various ports across the country. Weather conditions and logistics can make cargo shipping challenging at times: rainy conditions can disrupt operations and logistics at the ports, which in turn affects supply chain operations. Forecasting rainy conditions in advance allows us to optimize resources such as human resources, logistics, and transport for efficient supply chain operations at the ports. Business-wise, forecasting rainy conditions...
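
As a first, concrete piece of the monitoring implementation, the following is a minimal sketch of checking logged inference data for data drift against the training data using a two-sample Kolmogorov-Smirnov test. The DataFrames, the column names, and the 0.05 significance threshold are assumptions made for illustration; the hands-on implementation in this chapter works against the deployed API and its logged inference data.

```python
# Data drift check sketch (illustrative only): compare each training feature's
# distribution against the same feature in recently logged inference data.
import pandas as pd
from scipy.stats import ks_2samp


def detect_drift(train_df: pd.DataFrame, inference_df: pd.DataFrame,
                 p_threshold: float = 0.05) -> dict:
    """Flag features whose inference distribution differs from training."""
    report = {}
    for column in train_df.columns:
        # Two-sample KS test: a small p-value suggests the two samples
        # are unlikely to come from the same distribution.
        statistic, p_value = ks_2samp(train_df[column], inference_df[column])
        report[column] = {"p_value": p_value, "drift": p_value < p_threshold}
    return report


# Example usage with toy data (feature names and values are made up):
train = pd.DataFrame({
    "temperature": [2.0, 3.5, 4.1, 3.0, 2.8, 3.9, 4.4, 3.2],
    "humidity": [80, 85, 90, 88, 82, 79, 86, 91],
})
recent = pd.DataFrame({
    "temperature": [9.0, 10.2, 11.5, 9.8, 10.9, 11.1, 9.4, 10.0],
    "humidity": [60, 55, 58, 62, 57, 54, 59, 61],
})
print(detect_drift(train, recent))  # per-feature p-values and drift flags
```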

Governing your ML system

A great part of system governance involves quality assurance and control, model auditing, and reporting, so as to have end-to-end traceability and compliance with regulations. An ML system's efficacy (that is, its ability to produce the desired or intended result) depends on how it is governed to achieve maximum business value. So far, we have monitored and analyzed the inference data of our deployed model:

Figure 12.27 – Components of governing your ML system

The efficacy of an ML system is determined by the smart actions that are taken based on monitoring and alerting. In the next chapter, we will explore ML system governance in terms of alerts and actions, model QA and control, and model auditing and reports.
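
To illustrate what smart actions based on monitoring and alerting can look like in practice, here is a minimal sketch that turns a drift report and a live accuracy figure into an alert and a retraining trigger. The threshold value, the drift_report structure, and the send_alert/trigger_retraining helpers are hypothetical placeholders, not part of the book's implementation.

```python
# Illustrative sketch: map monitoring results to governed actions.
# The drift_report format matches the detect_drift sketch earlier in this
# chapter; the threshold and helper functions are hypothetical.
ACCURACY_THRESHOLD = 0.75


def send_alert(message: str) -> None:
    # Placeholder: in practice this could notify the team via email, chat, or a pager.
    print(f"ALERT: {message}")


def trigger_retraining() -> None:
    # Placeholder: in practice this could queue a retraining pipeline run.
    print("Retraining pipeline triggered")


def act_on_monitoring(drift_report: dict, live_accuracy: float) -> None:
    drifted = [name for name, result in drift_report.items() if result["drift"]]
    if drifted:
        send_alert("Data drift detected in features: " + ", ".join(drifted))
    if live_accuracy < ACCURACY_THRESHOLD:
        send_alert(f"Live accuracy {live_accuracy:.2f} is below the threshold")
        trigger_retraining()
```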

Summary

In this chapter, we learned about the key principles of serving ML models to our users and monitoring them to achieve maximum business value. We explored the different means of serving ML models to the users or consumers of a model, and implemented the Explainable Monitoring framework for a hypothetical business use case with a deployed model. We carried out this hands-on implementation of the Explainable Monitoring framework to measure the performance of an ML system. Finally, we discussed the need for governing ML systems to ensure their robust performance.

We will further explore the governance of ML systems and continual learning concepts in the next and final chapter!
