You're reading from Deep Learning with PyTorch Lightning

Product type: Book
Published in: Apr 2022
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800561618
Edition: 1st Edition
Author (1)
Kunal Sawarkar

Kunal Sawarkar is a chief data scientist and AI thought leader. He leads the worldwide partner ecosystem in building innovative AI products. He also serves as an advisory board member and an angel investor. He holds a master's degree from Harvard University with major coursework in applied statistics. He has been applying machine learning to solve previously unsolved problems in industry and society, with a special focus on deep learning and self-supervised learning. Kunal has led various AI product R&D labs and has 20+ patents and papers published in this field. When not diving into data, he enjoys rock climbing and learning to fly aircraft, and has an insatiable curiosity for astronomy and wildlife.

Controlling training

Training often needs audit, balance, and control mechanisms. Imagine you are training a model for 1,000 epochs and a network failure interrupts the run after 500 epochs. How do you resume training from that point without losing your progress? And how do you save a model checkpoint from a cloud environment? Let's see how to deal with these practical challenges, which are part and parcel of an engineer's life.
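The interruption scenario can be sketched in plain Python, independent of any framework. This is a minimal illustration under stated assumptions, not the book's method: the `train` function, the JSON checkpoint layout, and the `fail_at` parameter that simulates the network failure are all hypothetical names chosen for this example, and `weight` merely stands in for real model parameters.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically so a crash mid-write cannot corrupt the file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # atomic rename on the same filesystem


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"epoch": 0, "weight": 0.0}


def train(path, max_epochs, fail_at=None):
    """Toy training loop (hypothetical): resumes from whatever is stored at path."""
    state = load_checkpoint(path)
    for epoch in range(state["epoch"], max_epochs):
        if fail_at is not None and epoch == fail_at:
            raise ConnectionError(f"simulated failure at epoch {epoch}")
        state["weight"] += 0.5      # stand-in for one epoch of parameter updates
        state["epoch"] = epoch + 1  # number of completed epochs
        save_checkpoint(path, state)
    return state


ckpt_path = os.path.join(tempfile.mkdtemp(), "last.json")

try:
    train(ckpt_path, max_epochs=1000, fail_at=500)
except ConnectionError:
    pass  # the first run is interrupted after 500 completed epochs

resumed_from = load_checkpoint(ckpt_path)["epoch"]
final_state = train(ckpt_path, max_epochs=1000)  # picks up where the first run stopped
```

Saving after every epoch keeps the example simple; a real run would more likely checkpoint every few epochs or whenever a monitored metric improves, trading a little lost progress for less I/O.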

Saving model checkpoints when using the cloud

Notebooks hosted in cloud environments such as Google Colab have resource limits and idle timeout periods. If these limits are exceeded during the development of a model, the notebook is deactivated. Owing to the inherently elastic nature of the cloud environment (which is one of the value propositions of the cloud), the underlying compute and storage resources are decommissioned when a notebook is deactivated. If you refresh...
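One common way to guard against losing work when a notebook is deactivated is to write checkpoints to storage that outlives the session and, on restart, pick up the newest one. The following is a minimal sketch, assuming such a persistent directory (for example, a mounted drive) is available. The helper `latest_checkpoint` and the file names are hypothetical; a temporary directory stands in for the persistent location so the example runs anywhere, and `.ckpt` is the suffix PyTorch Lightning uses for checkpoint files by default.

```python
import os
import tempfile


def latest_checkpoint(ckpt_dir, suffix=".ckpt"):
    """Return the most recently modified checkpoint in ckpt_dir, or None."""
    if not os.path.isdir(ckpt_dir):
        return None
    candidates = [
        os.path.join(ckpt_dir, name)
        for name in os.listdir(ckpt_dir)
        if name.endswith(suffix)
    ]
    return max(candidates, key=os.path.getmtime) if candidates else None


# Stand-in for a persistent location; in a hosted notebook this would be
# a directory on storage that survives the session (an assumption here).
persistent_dir = tempfile.mkdtemp()

# Create dummy files with increasing modification times for the demo.
for i, name in enumerate(["epoch=1.ckpt", "epoch=2.ckpt", "notes.txt"]):
    path = os.path.join(persistent_dir, name)
    open(path, "w").close()
    os.utime(path, (1000 + i, 1000 + i))

resume_path = latest_checkpoint(persistent_dir)  # newest .ckpt file, ignoring notes.txt
```

In recent versions of PyTorch Lightning, the directory would typically be handed to `ModelCheckpoint(dirpath=...)` and the returned path to `trainer.fit(model, ckpt_path=resume_path)`; when no checkpoint exists yet, the helper returns `None` and training starts from scratch.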
