Chapter 6: Running Hyperparameter Tuning at Scale

Hyperparameter tuning or hyperparameter optimization (HPO) is a procedure that finds the best possible deep neural network structure, type of pretrained model, and model training process within reasonable computing resource constraints and time frames. Here, a hyperparameter is a parameter that cannot be changed or learned during the ML training process, such as the number of layers inside a deep neural network, the choice of a pretrained language model, or the learning rate, batch size, and optimizer of the training process. In this chapter, we will use HPO as shorthand for the process of hyperparameter tuning and optimization. HPO is a critical step in producing a high-performance ML/DL model. Given that the hyperparameter search space is very large, efficiently running HPO at scale is a major challenge. The complexity and high cost of evaluating a DL model, compared to classical ML models, further compound...
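For concreteness, a single hyperparameter configuration for a fine-tuning run might look like the following illustrative snippet (the names and values here are hypothetical examples, not prescriptions):

    # A hypothetical hyperparameter configuration: every value here is fixed
    # before training begins, unlike the model weights, which are learned.
    hyperparameters = {
        "foundation_model": "bert-base-uncased",  # choice of pretrained model
        "num_hidden_layers": 2,                   # network structure
        "lr": 2e-5,                               # learning rate
        "batch_size": 64,
        "optimizer": "AdamW",
    }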

Technical requirements

To understand the examples in this chapter, you will need to meet the following key technical requirements:

Understanding automatic HPO for DL pipelines

Automatic HPO has been studied for over two decades, since the first known paper on this topic was published in 1995 (https://www.sciencedirect.com/science/article/pii/B9781558603776500451). It is widely understood that tuning the hyperparameters of an ML model can improve its performance – sometimes dramatically. The rise of DL models in recent years has triggered a new wave of innovation and the development of new frameworks to tackle HPO for DL pipelines, because a DL model pipeline imposes many new, large-scale optimization challenges that cannot be easily solved by previous HPO methods. Note that, in contrast to model parameters, which are learned during training, hyperparameters must be set before training begins.

Difference between HPO and Transfer Learning's Fine-Tuning

In this book, we have been focusing on one successful DL approach called Transfer Learning...

Creating HPO-ready DL models with Ray Tune and MLflow

To use Ray Tune with MLflow for HPO, let's use the fine-tuning step in our DL pipeline example from Chapter 5, Running DL Pipelines in Different Environments, to see what needs to be set up and what code changes we need to make. Before we start, let's review a few key concepts that are specifically relevant to our usage of Ray Tune:

  • Objective function: An objective function is a metric to be minimized or maximized for a given configuration of hyperparameters. For example, in DL model training and fine-tuning scenarios, we would like to maximize the F1 score of an NLP text classifier. This objective function needs to be wrapped as a trainable function so that Ray Tune can perform HPO with it. In the following section, we will illustrate how to wrap our NLP text sentiment model; a minimal sketch of such a trainable function appears right after this list.
  • Function-based APIs and class-based APIs: A function-based API allows a user to insert Ray Tune statements...
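To make the objective function concrete, here is a minimal sketch of a function-based trainable (assuming a Ray Tune 1.x-style API; train_and_eval is a hypothetical stand-in for the real fine-tuning step, not a function from this book's pipeline):

    from ray import tune

    def train_and_eval(lr, batch_size):
        # Hypothetical stand-in: fine-tune the model with the given
        # hyperparameters and return a validation F1 score.
        return 1.0 - abs(lr - 0.01) - 0.0001 * batch_size

    def finetuning_trainable(config):
        # Each trial receives one sampled hyperparameter configuration.
        f1 = train_and_eval(config["lr"], config["batch_size"])
        # Report the objective metric back to Ray Tune (Ray 1.x-style API).
        tune.report(f1=f1)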

Running the first Ray Tune HPO experiment with MLflow

Now that we have set up Ray Tune and MLflow and created the HPO run function, we can run our first Ray Tune HPO experiment, as follows:

python pipeline/hpo_finetuning_model.py

After a couple of seconds, you will see the screen shown in Figure 6.2, which indicates that all 10 trials (that is, the value that we set for num_samples) are running concurrently:

Figure 6.2 – Ray Tune running 10 trials in parallel on a local multi-core laptop
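Under the hood, the driver code boils down to a single tune.run call. The following is a minimal sketch of such a call (an illustrative assumption built on the earlier finetuning_trainable sketch, not the book's exact script; the search space and experiment name are made up, and MLflowLoggerCallback is the Ray Tune 1.x-style MLflow integration):

    from ray import tune
    from ray.tune.integration.mlflow import MLflowLoggerCallback

    analysis = tune.run(
        finetuning_trainable,              # the trainable sketched earlier
        config={
            "lr": tune.loguniform(1e-4, 1e-1),
            "batch_size": tune.choice([32, 64, 128]),
        },
        num_samples=10,                    # run 10 trials
        metric="f1",
        mode="max",
        callbacks=[MLflowLoggerCallback(experiment_name="hpo_finetuning")],
    )
    print("Best hyperparameters found were:", analysis.best_config)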

After approximately 12–14 minutes, you will see that all the trials have finished and the best hyperparameters are printed on the screen, as shown in the following output (your results might vary due to the stochastic nature of the search, the limited number of samples, and the use of grid search, which does not guarantee a globally optimal result):

Best hyperparameters found were: {'lr': 0.025639008922511797, 'batch_size': 64, 'foundation_model'...

Running HPO with Ray Tune using Optuna and HyperBand

Now, let's experiment with different search algorithms and schedulers. Given that Optuna is such a good TPE-based search algorithm, and ASHA is a scheduler that runs trials asynchronously in parallel and terminates unpromising ones early, it is interesting to see how many changes we need to make to get this combination working.

It turns out that the changes are minimal, based on what we have already done in the previous section. Here, we will illustrate the four main changes, followed by a sketch of how the pieces fit together:

  1. Install the Optuna package. This can be done by running the following command:
    pip install optuna==2.10.0

This will install Optuna in the same virtual environment as before. If you have already run pip install -r requirements.txt, then Optuna is already installed and you can skip this step.

  2. Import the relevant Ray Tune modules that integrate with Optuna and the ASHA scheduler (here, we use the HyperBand implementation...
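For orientation, here is a sketch of how these pieces might be wired together (again built on the earlier illustrative sketches and Ray 1.x-style import paths, not the book's exact code), with OptunaSearch supplying TPE-based trial proposals and ASHAScheduler terminating unpromising trials early:

    from ray import tune
    from ray.tune.suggest.optuna import OptunaSearch
    from ray.tune.schedulers import ASHAScheduler

    analysis = tune.run(
        finetuning_trainable,
        config={
            "lr": tune.loguniform(1e-4, 1e-1),
            "batch_size": tune.choice([32, 64, 128]),
        },
        num_samples=10,
        metric="f1",
        mode="max",
        # Optuna's TPE sampler proposes new hyperparameter configurations.
        search_alg=OptunaSearch(),
        # ASHA (an asynchronous HyperBand variant) stops weak trials early.
        scheduler=ASHAScheduler(max_t=10, grace_period=1, reduction_factor=2),
    )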

Summary

In this chapter, we covered the fundamentals and challenges of HPO, why it is important for DL model pipelines, and what a modern HPO framework should support. We compared three popular frameworks – Ray Tune, Optuna, and HyperOpt – and picked Ray Tune as the winner for running state-of-the-art HPO at scale. We saw how to create HPO-ready DL model code using Ray Tune and MLflow and ran our first HPO experiment with them. Additionally, we covered how to switch to other search algorithms and schedulers once the HPO code framework is set up, using Optuna and the HyperBand-based ASHA scheduler as an example. The learnings from this chapter will help you competently carry out large-scale HPO experiments in real-life production environments, allowing you to produce high-performance DL models in a cost-effective way. We have also provided many references in the Further reading section at the end of this chapter to encourage you to study further.

In our...

Further reading
