Chapter 6: Running Hyperparameter Tuning at Scale

Hyperparameter tuning or hyperparameter optimization (HPO) is a procedure that finds the best possible deep neural network structure, type of pretrained model, and model training process within reasonable computing resource constraints and time frames. Here, a hyperparameter is a parameter that cannot be changed or learned during the ML training process, such as the number of layers inside a deep neural network, the choice of a pretrained language model, or the learning rate, batch size, and optimizer of the training process. In this chapter, we will use HPO as shorthand for the process of hyperparameter tuning and optimization. HPO is a critical step in producing a high-performance ML/DL model. Given that the hyperparameter search space is very large, efficiently running HPO at scale is a major challenge. The complexity and high cost of evaluating a DL model, compared to classical ML models, further compound...
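For concreteness, a single hyperparameter configuration for a fine-tuning run might look like the following illustrative snippet (the names and values here are hypothetical examples, not prescriptions):

    # A hypothetical hyperparameter configuration: every value here is fixed
    # before training begins, unlike the model weights, which are learned.
    hyperparameters = {
        "foundation_model": "bert-base-uncased",  # choice of pretrained model
        "num_hidden_layers": 2,                   # network structure
        "lr": 2e-5,                               # learning rate
        "batch_size": 64,
        "optimizer": "AdamW",
    }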

Technical requirements

To understand the examples in this chapter, you will need to meet the following key technical requirements:

Understanding automatic HPO for DL pipelines

Automatic HPO has been studied for over two decades, since the first known paper on this topic was published in 1995 (https://www.sciencedirect.com/science/article/pii/B9781558603776500451). It is widely understood that tuning the hyperparameters of an ML model can improve its performance – sometimes dramatically. The rise of DL models in recent years has triggered a new wave of innovation and the development of new frameworks to tackle HPO for DL pipelines, because a DL model pipeline imposes many new, large-scale optimization challenges that cannot be easily solved by previous HPO methods. Note that, in contrast to model parameters, which are learned during training, hyperparameters must be set before training begins.

Difference between HPO and Transfer Learning's Fine-Tuning

In this book, we have been focusing on one successful DL approach called Transfer Learning...

Creating HPO-ready DL models with Ray Tune and MLflow

To use Ray Tune with MLflow for HPO, let's use the fine-tuning step in our DL pipeline example from Chapter 5, Running DL Pipelines in Different Environments, to see what needs to be set up and what code changes we need to make. Before we start, let's review a few key concepts that are specifically relevant to our usage of Ray Tune:

  • Objective function: An objective function is a metric to be minimized or maximized for a given configuration of hyperparameters. For example, in DL model training and fine-tuning scenarios, we would like to maximize the F1 score of an NLP text classifier. This objective function needs to be wrapped as a trainable function so that Ray Tune can perform HPO with it. In the following section, we will illustrate how to wrap our NLP text sentiment model; a minimal sketch of such a trainable function appears right after this list.
  • Function-based APIs and class-based APIs: A function-based API allows a user to insert Ray Tune statements...
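To make the objective function concrete, here is a minimal sketch of a function-based trainable (assuming a Ray Tune 1.x-style API; train_and_eval is a hypothetical stand-in for the real fine-tuning step, not a function from this book's pipeline):

    from ray import tune

    def train_and_eval(lr, batch_size):
        # Hypothetical stand-in: fine-tune the model with the given
        # hyperparameters and return a validation F1 score.
        return 1.0 - abs(lr - 0.01) - 0.0001 * batch_size

    def finetuning_trainable(config):
        # Each trial receives one sampled hyperparameter configuration.
        f1 = train_and_eval(config["lr"], config["batch_size"])
        # Report the objective metric back to Ray Tune (Ray 1.x-style API).
        tune.report(f1=f1)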

Running the first Ray Tune HPO experiment with MLflow

Now that we have set up Ray Tune and MLflow and created the HPO run function, we can run our first Ray Tune HPO experiment, as follows:

python pipeline/hpo_finetuning_model.py

After a couple of seconds, you will see the screen shown in Figure 6.2, which indicates that all 10 trials (that is, the value that we set for num_samples) are running concurrently:

Figure 6.2 – Ray Tune running 10 trials in parallel on a local multi-core laptop
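Under the hood, the driver code boils down to a single tune.run call. The following is a minimal sketch of such a call (an illustrative assumption built on the earlier finetuning_trainable sketch, not the book's exact script; the search space and experiment name are made up, and MLflowLoggerCallback is the Ray Tune 1.x-style MLflow integration):

    from ray import tune
    from ray.tune.integration.mlflow import MLflowLoggerCallback

    analysis = tune.run(
        finetuning_trainable,              # the trainable sketched earlier
        config={
            "lr": tune.loguniform(1e-4, 1e-1),
            "batch_size": tune.choice([32, 64, 128]),
        },
        num_samples=10,                    # run 10 trials
        metric="f1",
        mode="max",
        callbacks=[MLflowLoggerCallback(experiment_name="hpo_finetuning")],
    )
    print("Best hyperparameters found were:", analysis.best_config)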

After approximately 12–14 minutes, you will see that all the trials have finished and the best hyperparameters are printed on the screen, as shown in the following output (your results might vary due to the stochastic nature of the search, the limited number of samples, and the use of grid search, which does not guarantee a globally optimal result):

Best hyperparameters found were: {'lr': 0.025639008922511797, 'batch_size': 64, 'foundation_model'...

Running HPO with Ray Tune using Optuna and HyperBand

Now, let's experiment with different search algorithms and schedulers. Given that Optuna is such a good TPE-based search algorithm, and ASHA is a scheduler that runs trials asynchronously in parallel and terminates unpromising ones early, it is interesting to see how many changes we need to make to get this combination working.

It turns out that the changes are minimal, based on what we have already done in the previous section. Here, we will illustrate the four main changes, followed by a sketch of how the pieces fit together:

  1. Install the Optuna package. This can be done by running the following command:
    pip install optuna==2.10.0

This will install Optuna in the same virtual environment as before. If you have already run pip install -r requirements.txt, then Optuna is already installed and you can skip this step.

  2. Import the relevant Ray Tune modules that integrate with Optuna and the ASHA scheduler (here, we use the HyperBand implementation...
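For orientation, here is a sketch of how these pieces might be wired together (again built on the earlier illustrative sketches and Ray 1.x-style import paths, not the book's exact code), with OptunaSearch supplying TPE-based trial proposals and ASHAScheduler terminating unpromising trials early:

    from ray import tune
    from ray.tune.suggest.optuna import OptunaSearch
    from ray.tune.schedulers import ASHAScheduler

    analysis = tune.run(
        finetuning_trainable,
        config={
            "lr": tune.loguniform(1e-4, 1e-1),
            "batch_size": tune.choice([32, 64, 128]),
        },
        num_samples=10,
        metric="f1",
        mode="max",
        # Optuna's TPE sampler proposes new hyperparameter configurations.
        search_alg=OptunaSearch(),
        # ASHA (an asynchronous HyperBand variant) stops weak trials early.
        scheduler=ASHAScheduler(max_t=10, grace_period=1, reduction_factor=2),
    )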

Summary

In this chapter, we covered the fundamentals and challenges of HPO, why it is important for DL model pipelines, and what a modern HPO framework should support. We compared three popular frameworks – Ray Tune, Optuna, and HyperOpt – and picked Ray Tune as the winner for running state-of-the-art HPO at scale. We saw how to create HPO-ready DL model code using Ray Tune and MLflow and ran our first HPO experiment with them. Additionally, we covered how to switch to other search algorithms and schedulers once the HPO code framework is set up, using Optuna and the HyperBand-based ASHA scheduler as an example. The learnings from this chapter will help you competently carry out large-scale HPO experiments in real-life production environments, allowing you to produce high-performance DL models in a cost-effective way. We have also provided many references in the Further reading section at the end of this chapter to encourage you to study further.

In our...

Further reading
