You're reading from Machine Learning Engineering with MLflow

Product type: Book
Published in: Aug 2021
Publisher: Packt
ISBN-13: 9781800560796
Edition: 1st Edition
Author: Natu Lauchande

Natu Lauchande is a principal data engineer in the fintech space currently tackling problems at the intersection of machine learning, data engineering, and distributed systems. He has worked in diverse industries, including biomedical/pharma research, cloud, fintech, and e-commerce/mobile. Along the way, he had the opportunity to be granted a patent (as co-inventor) in distributed systems, publish in a top academic journal, and contribute to open source software. He has also been very active as a speaker at machine learning/tech conferences and meetups.

Chapter 12: Advanced Topics with MLflow

In this chapter, we cover advanced topics that address common situations and use cases where you can apply your MLflow knowledge to types of models different from those presented in the rest of the book, broadening feature coverage and exposure to assorted topics.

Specifically, we will look at the following sections in this chapter: 

  • Exploring MLflow use cases with AutoML
  • Integrating MLflow with other languages
  • Understanding MLflow plugins

We will present each case in a pattern format: a brief description of the problem context, followed by a solution approach.

The sections of this chapter are independent of one another, as each addresses a different issue.

Technical requirements

For this chapter, you will need the following prerequisites: 

Exploring MLflow use cases with AutoML

Executing an ML project requires a breadth of knowledge in multiple areas and, in many cases, deep technical expertise. One emerging technique to ease adoption and accelerate time to market (TTM) is automated machine learning (AutoML), where some of the model developer's activities are automated. It automates steps in ML along two lines, outlined as follows:

  • Feature selection: Using optimization techniques (for example, Bayesian techniques) to select the best features as input to a model
  • Modeling: Automatically identifying a set of models to use by testing multiple algorithms using hyperparameter optimization techniques
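
The modeling step above can be illustrated with a deliberately small sketch. This is not PyCaret and not the book's code: it is a toy search, using only the Python standard library, that fits a few candidate models (a constant-mean baseline and k-nearest-neighbor regressors with several values of k) on synthetic data and keeps the one with the lowest validation error. The function names, the candidate list, and the data generator are all assumptions made for illustration.

```python
import random
import statistics


def make_data(n, seed):
    """Synthetic 1-D regression data: y = 2x + 1 plus Gaussian noise (illustrative only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.5)) for x in xs]


def fit_mean(train):
    """Baseline model: always predict the mean target seen in training."""
    mean_y = statistics.fmean(y for _, y in train)
    return lambda x: mean_y


def fit_knn(train, k):
    """k-nearest-neighbor regressor: average the targets of the k closest points."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean(y for _, y in nearest)
    return predict


def mse(model, data):
    """Mean squared error of a model on (x, y) pairs."""
    return statistics.fmean((model(x) - y) ** 2 for x, y in data)


def search(train, valid):
    """Try every candidate (algorithm + hyperparameter) and rank by validation error."""
    candidates = {"mean": lambda: fit_mean(train)}
    for k in (1, 3, 5, 9):
        candidates[f"knn(k={k})"] = lambda k=k: fit_knn(train, k)
    scores = {name: mse(build(), valid) for name, build in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best_name, all_scores = search(make_data(80, seed=0), make_data(40, seed=1))
    for name, score in sorted(all_scores.items(), key=lambda item: item[1]):
        print(f"{name}: validation MSE = {score:.3f}")
    print("selected:", best_name)
```

An AutoML library does the same kind of loop over a far larger space of algorithms and hyperparameters, and typically records each candidate's score so the comparison can be reviewed later.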

We will explore the integration of MLflow with an ML library called PyCaret (https://pycaret.org/) that allows us to leverage its AutoML techniques and log the process in MLflow so that you can automatically obtain the best performance...

Integrating MLflow with other languages

MLflow is primarily a tool rooted in the Python ecosystem of the ML space. At its core, MLflow components provide a Representational State Transfer (REST) interface. As long as application programming interface (API) wrappers are written, the underlying functionality is accessible from any language with REST support. The REST interface is extensively documented at https://www.mlflow.org/docs/latest/rest-api.html; most integrations with other languages consist of providing a concise, language-specific library layer over the API.
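
To make the idea of a thin wrapper over the REST interface concrete, here is a minimal sketch using only the Python standard library. It builds POST requests for two endpoints described in the MLflow REST documentation (experiments/create and runs/log-metric). The helper names, the tracking server address http://localhost:5000, and the experiment and metric values are assumptions for illustration; a wrapper in another language would follow the same shape.

```python
import json
import urllib.request

# Path prefix used by the MLflow REST API (version 2.0 of the API).
API_PREFIX = "/api/2.0/mlflow"


def build_request(tracking_uri, endpoint, payload):
    """Build (but do not send) a JSON POST request for an MLflow REST endpoint."""
    url = tracking_uri.rstrip("/") + API_PREFIX + endpoint
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_experiment_request(tracking_uri, name):
    """Request that asks the tracking server to create an experiment."""
    return build_request(tracking_uri, "/experiments/create", {"name": name})


def log_metric_request(tracking_uri, run_id, key, value, timestamp_ms):
    """Request that logs a single metric value against an existing run."""
    payload = {"run_id": run_id, "key": key, "value": value, "timestamp": timestamp_ms}
    return build_request(tracking_uri, "/runs/log-metric", payload)


def send(request):
    """Send a request and decode the JSON response (needs a reachable server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed local tracking server address; nothing is sent here.
    req = create_experiment_request("http://localhost:5000", "demo-experiment")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Separating request construction from sending keeps the wrapper easy to test without a running server; calling send(req) would perform the actual call against a tracking server.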

MLflow Java example

Many teams in the ML space work in a context where multiple languages are used. One of the most important platforms for large-scale distributed systems is the Java Virtual Machine (JVM). Being able to implement systems that can interact with Java-based systems is paramount for a smooth integration of MLflow with the wider information technology (IT) infrastructure.

We will show...

Understanding MLflow plugins

As an ML engineer, you may reach the limits of a framework several times during a project. MLflow provides an extension system through its plugin feature. A plugin architecture makes a software system extensible and adaptable.

MLflow allows the creation of the following types of plugins:

  • Tracking store plugins: This type of plugin controls and tweaks the store where your experiment metrics are logged, targeting a specific type of data store.
  • Artifact repository plugins: You can override the artifact repository with your own storage system, for example by adding an artifact repository based on the Hadoop Distributed File System (HDFS) or any object store specific to your environment, overriding API calls such as log_artifact and download_artifacts.
  • Run context providers: You can change how your system logs information about the run context, for instance, tags such as git_tags and repo_uri, and other relevant elements...
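
As a rough illustration of the last plugin type, the sketch below shows the shape of a run context provider: a class exposing in_context() and tags(), which MLflow discovers through a setuptools entry point group named mlflow.run_context_provider. The class name TeamContextProvider, the TEAM_NAME environment variable, the team tag, and the my_plugin.context module path are hypothetical. The fallback base class exists only so the sketch runs without MLflow installed.

```python
import os

try:
    # Base class MLflow expects run context providers to extend.
    from mlflow.tracking.context.abstract_context import RunContextProvider
except ImportError:
    class RunContextProvider:
        """Stand-in base class so this sketch runs without MLflow installed."""


class TeamContextProvider(RunContextProvider):
    """Hypothetical provider that tags runs with the owning team's name."""

    def in_context(self):
        # The provider applies only when the (assumed) TEAM_NAME variable is set.
        return "TEAM_NAME" in os.environ

    def tags(self):
        # Tags returned here are attached to runs created in this context.
        return {"team": os.environ["TEAM_NAME"]}


# Entry point declaration that would go in the plugin package's setup.py.
# The module path "my_plugin.context" is a hypothetical package layout.
ENTRY_POINTS = {
    "mlflow.run_context_provider": "unused=my_plugin.context:TeamContextProvider",
}


if __name__ == "__main__":
    os.environ["TEAM_NAME"] = "risk-models"
    provider = TeamContextProvider()
    if provider.in_context():
        print(provider.tags())
```

Tracking store and artifact repository plugins are registered the same way, through their own entry point groups, with the plugin class implementing the corresponding store or repository interface.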

Summary

In this chapter, we addressed some use cases with example MLflow pipelines. We looked at implementing AutoML in two different scenarios, including the case where we don't have targets and need to use anomaly detection as an unsupervised ML technique. We also addressed the use of non-Python-based platforms and concluded with how to extend MLflow with plugins.

At this stage, we have covered a good breadth and depth of topics in ML engineering using MLflow. Your next step is to explore further and apply the techniques learned in this book to your own projects.

Further reading

In order to further your knowledge, you can consult the documentation at the following links: 
