
You're reading from  Deep Learning with PyTorch Lightning

Product type: Book
Published in: Apr 2022
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800561618
Edition: 1st Edition
Author (1)
Kunal Sawarkar

Kunal Sawarkar is a chief data scientist and AI thought leader. He leads the worldwide partner ecosystem in building innovative AI products. He also serves as an advisory board member and an angel investor. He holds a master's degree from Harvard University with major coursework in applied statistics. He has been applying machine learning to solve previously unsolved problems in industry and society, with a special focus on deep learning and self-supervised learning. Kunal has led various AI product R&D labs and has 20+ patents and papers published in this field. When not diving into data, he loves doing rock climbing and learning to fly aircraft, in addition to an insatiable curiosity for astronomy and wildlife.

Chapter 9: Deploying and Scoring Models

Without knowing it, you may have already experienced some of the models we have covered so far in this book. Recall how your photos app can automatically detect faces in your picture collections or group all the pictures of a particular friend together. That is nothing more than an image recognition Deep Learning model, such as a Convolutional Neural Network (CNN), in action. You might also be familiar with Alexa listening to your voice or Google autocompleting your text as you type a search query. Those are NLP-based Deep Learning models making things easier for us. Or you might have seen some e-shopping apps or social media sites suggesting captions for a product; that is semi-supervised learning in its full glory! But how do you take a model that you have built in a Python Jupyter notebook and make it consumable on devices, be it a speaker, a phone, an app, or a portal? Without application integration, a trained model remains a statistical...

Technical requirements

The code for this chapter has been developed and tested on macOS with Anaconda and in Google Colab with Python. If you are using another environment, please adjust your environment variables accordingly. Please ensure you have the correct package versions installed before running the code.

In this chapter, we will primarily be using the following Python modules, mentioned with their versions:

  • pytorch-lightning (version 1.5.10)
  • torch (version 1.11.0)
  • requests (version 2.27.1)
  • torchvision (version 0.12.0)
  • flask (version 2.0.2)
  • pillow (version 8.2.0)
  • numpy (version 1.21.3)
  • json (version 2.0.9)
  • onnxruntime (version 1.10.0)
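
One way to confirm that an environment matches the versions listed above is a small check script. The following is a minimal sketch using only the Python standard library; the helper name `check_versions` is our own and not part of the chapter's code. The `json` module is left out because it ships with Python rather than being installed as a package.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this chapter. json is omitted: it is part of the
# Python standard library, not an installable package.
EXPECTED = {
    "pytorch-lightning": "1.5.10",
    "torch": "1.11.0",
    "requests": "2.27.1",
    "torchvision": "0.12.0",
    "flask": "2.0.2",
    "pillow": "8.2.0",
    "numpy": "1.21.3",
    "onnxruntime": "1.10.0",
}


def check_versions(expected):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, installed == wanted)
    return report


report = check_versions(EXPECTED)
for name, (installed, ok) in report.items():
    status = "OK" if ok else "MISMATCH"
    print(f"{name:20s} expected {EXPECTED[name]:8s} found {installed} [{status}]")
```

Any line reported as a mismatch points to a package that should be installed or pinned before running the chapter's examples.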

Working examples for this chapter can be found at this GitHub link: https://github.com/PacktPublishing/Deep-Learning-with-PyTorch-Lightning/tree/main/Chapter09.

To make sure that these modules work together and do not go out of sync, we have used the specific version of torch, torchvision, torchtext...

Deploying and scoring a Deep Learning model natively

Once a Deep Learning model is trained, it contains all the information about its structure, that is, its layers, weights, and so on. To use this model later in a production environment on new sets of data, we need to store it in a suitable format. The process of converting a data object into a format that can be stored (in a file or in memory) or transmitted is called serialization. Once a model is serialized in this way, it is a self-contained artifact that can be transferred to a different operating system or a different deployment environment (such as staging or production).

However, once a model is transferred to a production environment, we must reconstruct the model parameters and weights in their original format. This process of recreation from the serialized format is called de-serialization.
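
The round trip between serialization and de-serialization can be illustrated with a small sketch. It uses Python's built-in `pickle` module and a plain dictionary as a stand-in for a trained model's weights; the layer names, values, and helper functions are hypothetical and are not the chapter's checkpoint code.

```python
import io
import pickle

# Hypothetical stand-in for a trained model: layer names mapped to weights.
model_state = {
    "layer1.weight": [[0.12, -0.48], [0.91, 0.05]],
    "layer1.bias": [0.0, 0.1],
}


def serialize(state):
    """Convert an in-memory object into bytes that can be stored or sent."""
    buffer = io.BytesIO()
    pickle.dump(state, buffer)
    return buffer.getvalue()


def deserialize(payload):
    """Rebuild the original object from its serialized bytes."""
    return pickle.load(io.BytesIO(payload))


payload = serialize(model_state)   # bytes, ready to write to disk or transmit
restored = deserialize(payload)    # a new object with the same contents

print(type(payload).__name__, len(payload) > 0)
print(restored == model_state)
```

In PyTorch, `torch.save` and `torch.load` play the same roles for model checkpoints. As with any pickle-based format, only de-serialize files that come from a source you trust.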

There are some other ways to productionalize ML models as well, but the most commonly used method...

Deploying and scoring inter-portable models

A data scientist has many Deep Learning frameworks to choose from. The PyTorch Lightning framework is just the latest in a series of frameworks that includes TensorFlow, PyTorch, and even older ones such as Caffe and Torch. Each data scientist normally prefers one framework over the others, based on what they first studied or their comfort level. Some frameworks are in Python while others are in C++. It's hard to standardize on one framework within a project, let alone a department or a company. You may train a model first in PyTorch Lightning and then, after some time, need to refresh it in Caffe or TensorFlow. Being able to transfer a model between frameworks, that is, having a model that is inter-portable across frameworks and languages, thus becomes essential. ONNX (Open Neural Network Exchange) is one such format designed for this purpose.

In this section, we will see how we can achieve inter-portability in deployment using...

Next steps

Now that we have seen how to deploy and score a Deep Learning model, feel free to explore other challenges that sometimes accompany the consumption of models:

  • How do we scale the scoring for massive workloads, for example, serving 1 million predictions every second?
  • How do we keep the response time of scoring within a certain round-trip time? For example, the round trip between a request coming in and the score being served cannot exceed 20 milliseconds. You can also think of ways to optimize such DL models while deploying, such as batch inference and quantization.
  • Heroku is a popular option to deploy. You can deploy a simple ONNX model over Heroku under a free tier. You can deploy the model without the frontend or with a simple frontend to just upload a file. You can go a step further and use a production server, such as Uvicorn, Gunicorn, or Waitress, and try to deploy the model.
  • It is also possible to save the model as a .pt file and...
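
To make the quantization idea mentioned in the list above more concrete, here is a minimal pure-Python sketch of symmetric 8-bit quantization applied to a handful of weights. The function names and the sample weights are illustrative assumptions; real deployments would rely on the quantization tooling of the chosen framework or runtime rather than hand-written code like this.

```python
def quantize_int8(weights):
    """Symmetric linear quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale maps the largest magnitude onto 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the stored integers back to approximate float weights."""
    return [q * scale for q in quantized]


# Illustrative weights, not taken from any real model.
weights = [0.82, -1.27, 0.003, 0.5, -0.64]

quantized, scale = quantize_int8(weights)
approx = dequantize(quantized, scale)
max_error = max(abs(w - a) for w, a in zip(weights, approx))

print("quantized:", quantized)
print("scale:", scale)
print("max error:", max_error)
```

Each weight is now stored as a small integer plus one shared scale, which reduces storage to roughly a quarter of 32-bit floats at the cost of a rounding error of at most half the scale per weight.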

Further reading

Here is a link to the Inference in Production page of the PyTorch Lightning website: https://pytorch-lightning.readthedocs.io/en/latest/common/production_inference.html.

To learn more about ONNX and ONNX Runtime, visit their websites: https://onnx.ai and https://onnxruntime.ai.

Summary

Data scientists often play a supporting role in the model deployment and scoring aspects. However, in some companies (or smaller data science projects where there may not be a fully staffed engineering or ML-Ops team), data scientists may be asked to do such tasks. This chapter should be helpful in preparing you for doing both test and experimental deployments, as well as integration with end user applications.

We have seen in this chapter how a PyTorch Lightning model can be deployed and scored so that it is consumed via a REST API endpoint with the help of a Flask application. We have seen how to do so either natively via checkpoint files or via a portable file format such as ONNX. We have also seen how a portable format such as ONNX can aid the deployment process in real-life production situations, where multiple teams may be using different frameworks for training the models.

Looking back, we started our journey with an introduction to our first Deep Learning...
