Dealing with heteroskedasticity

In this recipe, we delve into the variance of time series. Variance measures how spread out the data is; in a time series, this dispersion can also evolve over time. You’ll learn how to handle data with a changing variance.

Getting ready

The variance of a time series can change over time, which also violates stationarity. In such cases, the time series is referred to as heteroskedastic, and it usually shows a long-tailed (left- or right-skewed) distribution. This condition is problematic because it impacts the training of neural networks and other models.
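
As a quick illustration (an addition to this recipe, not part of its dataset), a heteroskedastic series can be simulated by letting the noise scale grow over time:

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
t = np.arange(1000)
# The standard deviation grows with time, so the variance is not constant
series_sim = pd.Series(rng.normal(loc=0.0, scale=1 + t / 250))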

How to do it…

Dealing with non-constant variance is a two-step process. First, we use statistical tests to check whether a time series is heteroskedastic. Then, we use transformations such as the logarithm to stabilize the variance.

We can detect heteroskedasticity using statistical tests such as the White test or the Breusch-Pagan test. The following code implements these tests based on the statsmodels library:

import statsmodels.stats.api as sms
from statsmodels.formula.api import ols

# series_daily is the chapter's time series; build a DataFrame with its
# values and an explicit time column (1 for the first observation)
series_df = series_daily.reset_index(drop=True).reset_index()
series_df.columns = ['time', 'value']
series_df['time'] += 1

# Fit a linear model of the series values on time
olsr = ols('value ~ time', series_df).fit()

# Apply the White and Breusch-Pagan tests to the model residuals
_, pval_white, _, _ = sms.het_white(olsr.resid, olsr.model.exog)
_, pval_bp, _, _ = sms.het_breuschpagan(olsr.resid, olsr.model.exog)

The preceding code follows these steps:

  1. Import ols and the statistical tests module (statsmodels.stats.api) from statsmodels.
  2. Create a DataFrame based on the values of the time series and the row they were collected at (1 for the first observation).
  3. Create a linear model that relates the values of the time series with the time column.
  4. Run het_white (White) and het_breuschpagan (Breusch-Pagan) to apply the variance tests.

The output of each test is a p-value, where the null hypothesis posits that the time series has constant variance. So, if the p-value is below the significance level, we reject the null hypothesis and assume heteroskedasticity.
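
For example, with the common 5% significance level (a conventional choice, not one fixed by this recipe), the p-values computed above can be checked as follows:

alpha = 0.05
# Reject the null hypothesis of constant variance if either test is significant
if pval_white < alpha or pval_bp < alpha:
    print('Evidence of heteroskedasticity: consider a variance-stabilizing transformation')
else:
    print('No evidence of heteroskedasticity at the 5% level')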

The simplest way to deal with non-constant variance is by transforming the data using the logarithm. This operation can be implemented as follows:

import numpy as np

class LogTransformation:
    @staticmethod
    def transform(x):
        # Signed log: defined for zero and negative values as well
        xt = np.sign(x) * np.log(np.abs(x) + 1)
        return xt

    @staticmethod
    def inverse_transform(xt):
        # Revert the signed log transformation
        x = np.sign(xt) * (np.exp(np.abs(xt)) - 1)
        return x

The preceding code defines a Python class called LogTransformation with two static methods: transform() and inverse_transform(). The first transforms the data using a signed log (1 is added inside the logarithm, so zero and negative values are handled), and the second reverts that operation.

We apply the transform() method to the time series as follows:

series_log = LogTransformation.transform(series_daily)
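
As a quick sanity check (an addition to the recipe), the inverse transformation should recover the original series:

series_restored = LogTransformation.inverse_transform(series_log)
# Should print True: the signed log transformation is invertible
print(np.allclose(series_daily, series_restored))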

The log is a particular case of the Box-Cox transformation, which is available in the scipy library. You can implement this method as follows:

from scipy import stats

# Box-Cox requires strictly positive input data
series_transformed, lmbda = stats.boxcox(series_daily)

The stats.boxcox() method estimates a transformation parameter, lmbda, which can be used to revert the operation.
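
For instance, the inverse transformation is available in scipy.special (a minimal sketch using the lmbda estimated above):

from scipy.special import inv_boxcox

# Revert the Box-Cox transformation with the estimated lambda
series_original = inv_boxcox(series_transformed, lmbda)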

How it works…

The transformations outlined in this recipe stabilize the variance of a time series. They also bring the data distribution closer to the Normal distribution. These transformations are especially useful for neural networks as they help avoid saturation areas. In neural networks, saturation occurs when the model becomes insensitive to different inputs, thus compromising the training process.
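
As a quick illustration of saturation (an addition to the recipe), consider how a tanh activation flattens for large inputs:

import numpy as np

x = np.array([0.5, 5.0, 500.0])
# Beyond a certain magnitude, tanh outputs are nearly indistinguishable
# (close to 1.0), so gradients vanish and training stalls
print(np.tanh(x))  # approximately [0.4621 0.9999 1.0]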

There’s more…

The Yeo-Johnson power transformation is similar to the Box-Cox transformation but allows for negative values in the time series. You can learn more about this method at the following link: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.yeojohnson.html.
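
A minimal usage sketch, assuming the same series_daily as before (scipy.stats.yeojohnson mirrors the boxcox interface):

from scipy import stats

# Unlike Box-Cox, Yeo-Johnson also accepts zero and negative values
series_yj, lmbda_yj = stats.yeojohnson(series_daily)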

See also

You can learn more about the importance of the logarithm transformation in the following reference:

Bandara, Kasun, Christoph Bergmeir, and Slawek Smyl. “Forecasting across time series databases using recurrent neural networks on groups of similar series: A clustering approach.” Expert Systems with Applications 140 (2020): 112896.
