Designing Machine Learning Pipelines (MLOps) and Their Testing

MLOps, short for machine learning (ML) operations, is a set of practices and techniques aimed at streamlining the deployment, management, and monitoring of ML models in production environments. It borrows concepts from the DevOps (development and operations) approach, adapting them to the unique challenges posed by ML.

The main goal of MLOps is to bridge the gap between data science and operations teams, fostering collaboration and ensuring that ML projects can be effectively and reliably deployed at scale. MLOps helps to automate and optimize the entire ML life cycle, from model development to deployment and maintenance, thus improving the efficiency and effectiveness of ML systems in production.

In this chapter, we learn how ML systems are designed and operated in practice. The chapter shows how pipelines are turned into a software system, with a focus on testing ML pipelines and deploying them on Hugging Face...

What ML pipelines are

In recent years, the field of ML has witnessed remarkable advances, revolutionizing industries and enabling innovative applications. As the demand for more sophisticated and accurate models grows, so does the complexity of developing and deploying them effectively. The industrial adoption of ML systems has called for more rigorous testing and validation of these systems. In response to these challenges, the concept of ML pipelines has emerged as a crucial framework for streamlining the entire ML development process, from data preprocessing and feature engineering to model training and deployment. This chapter explores the applications of MLOps in the context of both cutting-edge deep learning (DL) models, such as the Generative Pre-trained Transformer (GPT), and classical ML models.

We begin by exploring the underlying concepts of ML pipelines, stressing their importance in organizing the ML workflow and promoting collaboration...

ML pipelines – how to use ML in a system in practice

Training and validating ML models on a local platform is only the beginning of using an ML pipeline. After all, a model would be of limited use if we had to retrain it on every one of our customers' computers.

Therefore, we often deploy ML models to a model repository. There are a few popular ones, but the one with the largest community is the Hugging Face Hub. In that repository, we can publish both models and datasets, and even create spaces where the models can be tried out without being downloaded. Let us deploy the model trained in Chapter 11 to that repository. For that, we need an account at huggingface.co, and then we can start.

Deploying models to Hugging Face

First, we need to create a new model using the New button on the main page, as in Figure 12.2:

Figure 12.2 – New button to create a model

Then, we fill...
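
Besides the web UI, a trained model can also be pushed programmatically. The following is a minimal sketch, assuming the model from Chapter 11 is a transformers sequence classification model saved locally in ./trained_model and that we have created an access token in our Hugging Face account settings; the repository id my-user/chapter-11-model is a placeholder for your own namespace and model name:

from huggingface_hub import login
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Authenticate with a personal access token (placeholder value)
login(token="hf_...")

# Load the locally trained model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("./trained_model")
tokenizer = AutoTokenizer.from_pretrained("./trained_model")

# push_to_hub creates the repository on the Hub if it does not exist
model.push_to_hub("my-user/chapter-11-model")
tokenizer.push_to_hub("my-user/chapter-11-model")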

Raw data-based pipelines

Creating a full pipeline from scratch can be a daunting task, as it requires building customized tooling for every model and every kind of data. Doing so lets us optimize how we use the models, but it requires a lot of effort. The main rationale behind pipelines is that they link two areas of ML: the model, with its computational capabilities, and the task, with the data from the domain. Luckily for us, the main model hubs, such as Hugging Face, provide an API that constructs ML pipelines automatically. Pipelines in Hugging Face are tied to the model and are provided by the framework based on the model's architecture, input, and output.
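
As an illustration, here is a minimal sketch of the transformers pipeline API. Given only a model id from the Hub, the framework infers the task and wires up tokenization, inference, and output decoding; the model id below is a public sentiment classification checkpoint, but any compatible model would work:

from transformers import pipeline

# The task (text classification) is inferred from the model's
# configuration on the Hub; pre- and post-processing come for free
pipe = pipeline(model="distilbert-base-uncased-finetuned-sst-2-english")
print(pipe("Pipelines hide the preprocessing details."))
# e.g. [{'label': 'POSITIVE', 'score': 0.98}]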

Pipelines for NLP-related tasks

Text classification is a pipeline designed to classify text input into predefined categories or classes. It’s particularly useful for tasks such as sentiment analysis (SA), topic categorization, spam detection, intent recognition, and so on. The pipeline typically employs pre-trained models fine-tuned...
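
As a concrete example of one of these tasks, the following sketch uses the zero-shot-classification pipeline for topic categorization; the input text and candidate labels are illustrative only:

from transformers import pipeline

# With no model id given, transformers falls back to a default model
# for the task; in production we would pin a specific model instead
topics = pipeline("zero-shot-classification")
result = topics("The build fails after upgrading the compiler.",
                candidate_labels=["build system", "security", "user interface"])
print(result["labels"][0])  # the highest-scoring topic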

Feature-based pipelines

Feature-based pipelines do not have specific classes because they operate at a much lower level. They consist of the model.fit() and model.predict() calls from a standard Python ML implementation. These pipelines require software developers to prepare the data manually and to take care of the results manually; that is, by implementing preprocessing steps, such as converting data to tables using one-hot encoding, and post-processing steps, such as converting the predictions into human-readable output.
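
The following is a minimal sketch of such a feature-based pipeline, assuming scikit-learn and pandas; the column names and data are purely illustrative:

import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Illustrative training data with one categorical feature
data = pd.DataFrame({
    "module": ["ui", "core", "ui", "db"],
    "commits": [12, 45, 7, 30],
    "label": [0, 1, 0, 1],
})

# Manual preprocessing: one-hot encode the categorical column
X = pd.get_dummies(data[["module", "commits"]])
y = data["label"]

model = DecisionTreeClassifier().fit(X, y)

# Manual post-processing: map numeric predictions to readable text
names = {0: "low risk", 1: "high risk"}
print([names[p] for p in model.predict(X)])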

An example of this kind of pipeline is the defect prediction that we saw in earlier parts of the book, so it does not need to be repeated here.

What is important, however, is that all pipelines are the link between the ML domain and the software engineering domain. The first activity that I do after developing a pipeline is to test it.

Testing of ML pipelines

Testing of ML pipelines is done at multiple levels, starting with unit tests and moving up toward integration (component) tests and then to system and acceptance tests. In these tests, two elements are important – the model itself and the data (both the input data for the model and the oracle used to judge its output).

Although we can use the unittest framework included in Python's standard library, I strongly recommend using the Pytest framework instead, due to its simplicity and flexibility. We can install it with this command:

>> pip install pytest

That will download and install the required packages.
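
Once installed, a unit test for a pipeline can look like the following minimal sketch, which assumes the sentiment classification pipeline shown earlier in this chapter. Saved as test_pipeline.py, it is discovered and run with the pytest command; the assertions act as a simple oracle:

from transformers import pipeline

def test_sentiment_label_is_valid():
    # Arrange: the model under test
    classifier = pipeline("sentiment-analysis")
    # Act: run inference on known test data
    result = classifier("I love this library!")[0]
    # Assert: the oracle checks for a valid label and a sane score
    assert result["label"] in {"POSITIVE", "NEGATIVE"}
    assert 0.0 <= result["score"] <= 1.0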

Best practice #62

Use a professional testing framework such as Pytest.

Using a professional framework provides us with the compatibility required by MLOps principles: we can share our models, data, source code, and all other elements without cumbersome setup and installation of the frameworks themselves. For Python, I recommend using the Pytest framework...

Monitoring ML systems at runtime

Monitoring pipelines in production is a critical aspect of MLOps to ensure the performance, reliability, and accuracy of deployed ML models. This includes several practices.

The first practice is logging and collecting metrics. This activity includes instrumenting the ML code with logging statements to capture relevant information during model training and inference. Key metrics to monitor are model accuracy, data drift, latency, and throughput. Popular logging and monitoring frameworks include Prometheus, Grafana, and the Elasticsearch, Logstash, and Kibana (ELK) stack.
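
As a sketch of such instrumentation, using only Python's standard logging module (the metric names are illustrative, not the schema of any particular monitoring product):

import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ml.inference")

def predict_with_metrics(pipe, text):
    start = time.perf_counter()
    result = pipe(text)[0]
    latency_ms = (time.perf_counter() - start) * 1000
    # Log latency and the model's confidence; aggregated over time,
    # these records feed accuracy and drift dashboards
    logger.info("label=%s score=%.3f latency_ms=%.1f",
                result["label"], result["score"], latency_ms)
    return result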

The second practice is alerting: setting up alerts based on predefined thresholds for key metrics. This helps in proactively identifying issues or anomalies in the production pipeline. When an alert is triggered, the appropriate team members can be notified to investigate and address the problem promptly.
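
In its simplest form, such a threshold check is a few lines of Python. In production, this logic usually lives in an alerting tool such as Prometheus rather than in application code, so the following is only illustrative:

ACCURACY_THRESHOLD = 0.85  # illustrative threshold

def check_accuracy(current_accuracy, notify):
    # notify is any callable that reaches the team,
    # for example an e-mail or chat webhook
    if current_accuracy < ACCURACY_THRESHOLD:
        notify(f"Model accuracy dropped to {current_accuracy:.2f}")

check_accuracy(0.79, notify=print)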

Data drift detection is the third activity, which includes monitoring the distribution...
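
One simple way to operationalize such a check is a two-sample statistical test between a feature's training distribution and its live distribution. The following hedged sketch assumes scipy and uses synthetic data; the 0.05 significance level is illustrative:

import numpy as np
from scipy.stats import ks_2samp

def drifted(train_feature, live_feature, alpha=0.05):
    # A small p-value means the two samples are unlikely to come
    # from the same distribution, that is, the feature has drifted
    statistic, p_value = ks_2samp(train_feature, live_feature)
    return p_value < alpha

rng = np.random.default_rng(42)
train = rng.normal(0.0, 1.0, 1000)
live = rng.normal(0.5, 1.0, 1000)  # a shifted mean simulates drift
print(drifted(train, live))  # True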

Summary

Constructing ML pipelines concludes the part of the book that focuses on the core technical aspects of ML. Pipelines are important for ensuring that the ML models are used according to best practices in software engineering.

However, ML pipelines are still not a complete ML system. They can only run inference on data and produce an output. For the pipelines to function effectively, they need to be connected to other parts of the system, such as the user interface and storage. That is the content of the next chapter.
