Strategies for Global Deep Learning Forecasting Models

Over the last few chapters, we have been building up deep learning for time series forecasting. We started with the basics of deep learning, explored the different building blocks, used some of those blocks in practice to generate forecasts on a sample household, and finally, discussed attention and transformers. Now, let’s slightly alter our trajectory and take a look at global models for deep learning. In Chapter 10, Global Forecasting Models, we saw why global models make sense and how we can use such a model in a machine learning context. We even got good results in our experiments. In this chapter, we will look at how we can apply similar concepts from a deep learning perspective, and at different strategies that we can use to make global deep learning models work better.

In this chapter, we will be covering these main topics:

  • Creating global deep learning forecasting models...

Technical requirements

You will need to set up the Anaconda environment by following the instructions in the Preface so that you have a working environment with all the packages and datasets required for the code in this book.

You will need to run these notebooks:

  • 02 - Preprocessing London Smart Meter Dataset.ipynb in Chapter02
  • 01-Setting up Experiment Harness.ipynb in Chapter04
  • 01-Feature Engineering.ipynb in Chapter06

The associated code for the chapter can be found at https://github.com/PacktPublishing/Modern-Time-Series-Forecasting-with-Python-/tree/main/notebooks/Chapter15.

Creating global deep learning forecasting models

In Chapter 10, Global Forecasting Models, we talked in detail about why a global model makes sense. We talked about the benefits regarding increased sample size, cross-learning, multi-task learning and the regularization effect that comes with it, and reduced engineering complexity. All of these are relevant for a deep learning model as well. Engineering complexity and sample size become even more important because deep learning models are data-hungry and demand considerably more engineering effort and training time than other machine learning models. I would go so far as to say that in most practical cases where we have to forecast at scale, global models are the only deep learning paradigm that makes sense.

So, why did we spend all that time looking at individual models? Well, it’s easier to grasp the concept at that level, and the skills and knowledge we gained at that level are very easily...

Using time-varying information

The GFM(ML) used all the available features, so it obviously had access to a lot more information than the GFM(DL) we have built so far, which takes in only the history of the target and nothing else. Let’s change that by including time-varying information. We will use only time-varying real features this time because dealing with categorical features is a topic I want to leave for the next section.

We initialize the training dataset the same way as before, but we add time_varying_known_reals=feat_config.time_varying_known_reals to the initialization parameters. Now that we have all the datasets created, let’s move on to setting up the model.
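To make this concrete, here is a minimal sketch of what that initialization might look like. The column names (time_idx, energy_consumption, and the LCLid household identifier) and the window lengths are illustrative assumptions, and feat_config stands in for the feature configuration used in the chapter’s notebooks:

from pytorch_forecasting import TimeSeriesDataSet
from pytorch_forecasting.data import GroupNormalizer

training = TimeSeriesDataSet(
    train_df,                      # preprocessed long-format DataFrame
    time_idx="time_idx",           # integer time-step column
    target="energy_consumption",
    group_ids=["LCLid"],           # one time series per household
    max_encoder_length=48,         # history window fed to the model
    max_prediction_length=1,       # one-step-ahead forecast
    time_varying_unknown_reals=["energy_consumption"],
    # The new addition: known real-valued covariates such as calendar features
    time_varying_known_reals=feat_config.time_varying_known_reals,
    target_normalizer=GroupNormalizer(groups=["LCLid"]),
)

# Peeking at a batch shows how the tensors are arranged
dataloader = training.to_dataloader(train=True, batch_size=64)
x, y = next(iter(dataloader))
print(x["encoder_cont"].shape)  # (batch, encoder_length, n_continuous_features)
print(x["decoder_cont"].shape)  # known reals are also available over the horizon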

To set up the model, we need to understand one concept. We are now using the history of the target as well as time-varying known features. In Figure 15.3, we saw how TimeSeriesDataSet arranges the different kinds of variables in PyTorch tensors. In the previous section, we used only...

Using static/meta information

Some features, such as the Acorn group or whether dynamic pricing is enabled, are specific to a household and will help the model learn patterns particular to those groups. Naturally, including that information makes intuitive sense. But as we discussed in Chapter 10, Global Forecasting Models, categorical features do not play well with machine learning models because they aren’t numerical. In that chapter, we discussed a few ways of encoding categorical features into numerical representations. We can use any of those in a deep learning model as well. But there is one way of handling categorical features that is unique to deep learning models – embedding vectors.
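To make the idea concrete before we get into the details, here is a minimal, self-contained sketch of an embedding layer for a single categorical feature. The cardinality and embedding size are made-up numbers for illustration, not values from our dataset configuration:

import torch
import torch.nn as nn

num_acorn_groups = 19   # assumed cardinality of the categorical feature
embedding_dim = 4       # dense representation, far smaller than one-hot

acorn_embedding = nn.Embedding(num_acorn_groups, embedding_dim)

# A mini-batch of integer-encoded Acorn labels, one per household
acorn_ids = torch.tensor([0, 7, 18, 3])
dense = acorn_embedding(acorn_ids)   # shape: (4, embedding_dim)

The embedding weights are ordinary parameters trained jointly with the rest of the network, so households that behave similarly can end up close together in the embedding space, something a fixed one-hot representation cannot do.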

One-hot encoding and why it is not ideal

One of the ways of converting categorical features to a numerical representation is one-hot encoding. It encodes the categorical feature in a higher dimension, placing the categorical values equally distant in...

Using the scale of the time series

We used GroupNormalizer in TimeSeriesDataSet to scale each household using its own mean and standard deviation. We did this because we wanted to make the target zero mean and unit variance so that the model does not waste effort changing its parameters to capture the scale of individual household consumption. Although this is a good strategy, we do lose some information here. There may be patterns that are specific to households whose consumption is on the larger side and other patterns specific to households that consume much less. But now, they are both lumped together and the model tries to learn common patterns. In such a scenario, these unique patterns look like noise to the model because there is no variable to explain them.
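As a rough sketch, the scaling described here can be set up as follows; LCLid, the household identifier column in the London Smart Meter data, is an assumption about the exact column name used:

from pytorch_forecasting.data import GroupNormalizer

# One set of scaling statistics per household, so each series becomes
# (approximately) zero mean and unit variance after normalization
target_normalizer = GroupNormalizer(
    groups=["LCLid"],
    method="standard",  # standard (mean/std) scaling
)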

The bottom line is that there is information in the scale that we removed, and adding that information back would be beneficial. So, how do we add it back? Definitely not by including the...

Balancing the sampling procedure

We saw a few strategies for improving a global deep learning model by adding new types of features. Now, let’s look at a different aspect that is relevant in a global modeling context. In an earlier section, when we were talking about global deep learning models, we noted that the process by which we sample a window from a sequence to feed to our model can be thought of as a two-step process:

  1. Sampling a time series out of a set of time series
  2. Sampling a window out of that time series

Let’s use an analogy to make the concept clearer. Imagine we have a large bowl that we have filled with balls. Each ball in the bowl represents a time series in the dataset (a household in our dataset). Now, each ball contains chits of paper representing all the different windows we can sample from that time series.

In the batch sampling we use by default, we open all the balls, dump all the chits into the bowl, and discard the balls...
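One way to realize the two-step sampling in practice, sketched here with PyTorch’s WeightedRandomSampler rather than any exact implementation from the book: weighting each window by the inverse of its series’ window count makes every series equally likely to be picked first, after which a window is drawn uniformly from within it.

import numpy as np
import torch
from torch.utils.data import WeightedRandomSampler

# Assume window_series_ids maps every candidate window (a "chit") to the
# index of the series (the "ball") it came from; toy values shown here
window_series_ids = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2])

# Default behaviour: every chit is equally likely, so series 0 dominates.
# Balanced behaviour: weight each chit by 1 / (chits in its ball), which
# makes every series equally likely before a window is drawn from it.
counts = np.bincount(window_series_ids)
weights = 1.0 / counts[window_series_ids]

sampler = WeightedRandomSampler(
    weights=torch.as_tensor(weights, dtype=torch.double),
    num_samples=len(weights),
    replacement=True,
)
# Pass sampler=sampler to the DataLoader to draw balanced mini-batches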

Summary

Having built a strong foundation in deep learning models over the last few chapters, we started to look at a new paradigm of global models in the context of deep learning. We learned how to use PyTorch Forecasting, an open-source library for forecasting using deep learning, and used the feature-rich TimeSeriesDataSet to start developing our own models.

We started off with a very simple LSTM in the global context and saw how we can add time-varying information, static information, and the scale of individual time series to the features to make the models better. We closed by looking at an alternative sampling procedure for mini-batches that helps us present a more balanced view of the problem in each batch. This chapter is by no means an exhaustive list of such techniques for making forecasting models better. Instead, it aims to build the kind of thinking necessary to work on your own models and make them work better than before.

And...

Further reading

You can check out the following sources for further reading:
