You're reading from  Deep Learning for Time Series Cookbook

Product type: Book
Published in: Mar 2024
Publisher: Packt
ISBN-13: 9781805129233
Edition: 1st Edition
Authors (2):
Vitor Cerqueira

Vitor Cerqueira is a time series researcher with an extensive background in machine learning. Vitor obtained his Ph.D. degree in Software Engineering from the University of Porto in 2019. He is currently a postdoctoral researcher at Dalhousie University, Halifax, developing machine learning methods for time series forecasting. Vitor has co-authored several scientific articles that have been published in multiple high-impact research venues.

Luís Roque

Luís Roque is the Founder and Partner of ZAAI, a company focused on AI product development, consultancy, and investment in AI startups. He also serves as the Vice President of Data & AI at Marley Spoon, leading teams across data science, data analytics, data product, data engineering, machine learning operations, and platforms. In addition, he holds the position of AI Advisor at CableLabs, where he contributes to integrating the broadband industry with AI technologies. Luís is also a Ph.D. Researcher in AI at the University of Porto's AI&CS lab and oversees the Data Science Master's program at Nuclio Digital School in Barcelona. Previously, he co-founded HUUB, where he served as CEO until its acquisition by Maersk.

Deep Learning for Time Series Anomaly Detection

In this chapter, we’ll delve into anomaly detection problems using time series data. This task involves detecting rare observations that are significantly different from most samples in a dataset. We’ll explore different approaches to tackle this problem, such as prediction-based methods or reconstruction-based methods. This includes using powerful methods such as autoencoders (AEs), variational AEs (VAEs), or generative adversarial networks (GANs).

By the end of this chapter, you’ll be able to define time series anomaly detection problems using different approaches with Python.

The chapter covers the following recipes:

  • Time series anomaly detection with Autoregressive Integrated Moving Average (ARIMA)
  • Prediction-based anomaly detection using deep learning (DL)
  • Anomaly detection using a long short-term memory (LSTM) AE
  • Building an AE using PyOD
  • Creating a VAE for time series anomaly...

Technical requirements

The models developed in this chapter are based on different frameworks. First, we show how to develop prediction-based methods using the statsforecast and neuralforecast libraries. Other methods, such as an LSTM AE, will be explored using the PyTorch Lightning ecosystem. Finally, we’ll also use the PyOD library to create anomaly detection models based on approaches such as GANs or VAEs. Of course, we also rely on typical data manipulation libraries such as pandas or NumPy. The following list contains all the required libraries for this chapter:

  • scikit-learn (1.3.2)
  • pandas (2.1.3)
  • NumPy (1.26.2)
  • statsforecast (1.6.0)
  • datasetsforecast (0.0.8)
  • neuralforecast (1.6.4)
  • torch (2.1.1)
  • PyTorch Lightning (2.1.2)
  • PyTorch Forecasting (1.0.0)
  • PyOD (1.1.2)

The code and datasets used in this chapter can be found at the following GitHub URL: https://github.com/PacktPublishing/Deep-Learning-for-Time-Series-Data-Cookboo...

Time series anomaly detection with ARIMA

Time series anomaly detection is an important task in application domains such as healthcare or manufacturing, among many others. Anomaly detection methods aim to identify observations that do not conform to the typical behavior of a dataset. In practice, anomalies can represent phenomena such as faults in machinery or fraudulent activity. Anomaly detection is a common task in machine learning (ML), and it has a few dedicated methods when it involves time series data. This type of dataset and the patterns therein can evolve over time, which complicates the modeling process and the effectiveness of the detectors. Statistical learning methods for time series anomaly detection problems usually follow a prediction-based approach or a reconstruction-based approach. In this recipe, we describe how to use an ARIMA method to create a prediction-based anomaly detection system for univariate time series.
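The prediction-based idea described above can be illustrated with a minimal sketch. It does not reproduce the recipe's ARIMA code: a rolling mean stands in for the fitted forecasting model so that the example runs with the standard library alone, and the function names, the toy series, and the threshold `k` are all assumptions made for illustration.

```python
from statistics import mean, stdev

def rolling_mean_forecast(series, window):
    """One-step-ahead forecasts; a simple stand-in for a fitted ARIMA model."""
    return [mean(series[i - window:i]) for i in range(window, len(series))]

def flag_anomalies(series, window=3, k=2.0):
    """Return indices whose forecast residual deviates by more than k standard deviations."""
    forecasts = rolling_mean_forecast(series, window)
    # Residual = observed value minus the forecast made for that time step
    residuals = [y - f for y, f in zip(series[window:], forecasts)]
    center, scale = mean(residuals), stdev(residuals)
    return [i + window for i, r in enumerate(residuals) if abs(r - center) > k * scale]

# Toy series (hypothetical data) with a single spike at index 10
series = [10, 11, 10, 12, 11, 10, 11, 12, 10, 11, 40, 11, 10, 12, 11, 10]
anomalies = flag_anomalies(series, window=3, k=2.0)
print(anomalies)
```

The spike also distorts the next few forecasts and inflates the residual spread, so the choice of `k` matters. A forecasting model that provides prediction intervals offers a more principled threshold than a fixed multiple of the residual standard deviation.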

Getting ready

We’ll focus on a univariate...

Prediction-based anomaly detection using DL

We continue to explore prediction-based methods in this recipe. This time, we’ll create a forecasting model based on DL and use the error of its point forecasts as the reference for detecting anomalies.

Getting ready

We’ll use a time series dataset about the number of taxi trips in New York City. This dataset is considered a benchmark problem for time series anomaly detection tasks. You can check the source at the following link: https://databank.illinois.edu/datasets/IDB-9610843.

Let’s start by loading the time series using pandas:

from datetime import datetime
import pandas as pd
dataset = pd.read_csv('assets/datasets/taxi/taxi_data.csv')
labels = pd.read_csv('assets/datasets/taxi/taxi_labels.csv')
dataset['ds'] = pd.Series([datetime.fromtimestamp(x) 
    for x in dataset['timestamp']])
dataset = dataset.drop('timestamp'...

Anomaly detection using an LSTM AE

In this recipe, we’ll build an AE to detect anomalies in time series. An AE is a type of neural network (NN) that tries to reconstruct the input data. The motivation to use this kind of model for anomaly detection is that the reconstruction process of anomalous data is more difficult than that of typical observations.
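To make the reconstruction-error intuition concrete, here is a small sketch under strong simplifying assumptions. The `reconstruct` function is a hypothetical stand-in for a trained AE: it compresses each window to a single number (its mean) and expands it back, which mimics a bottleneck without any training. The windows are toy data, not the taxi dataset.

```python
from statistics import mean

def reconstruct(window):
    """Stand-in for a trained AE: compress the window to its mean, then expand it back."""
    level = mean(window)
    return [level] * len(window)

def reconstruction_error(window):
    """Mean squared error between a window and its reconstruction."""
    rebuilt = reconstruct(window)
    return mean((x - r) ** 2 for x, r in zip(window, rebuilt))

# Toy windows (hypothetical data); the third one contains a spike
windows = [
    [10, 11, 10, 11],
    [11, 10, 11, 10],
    [10, 11, 40, 11],
    [11, 10, 11, 12],
]
errors = [reconstruction_error(w) for w in windows]
# The window that is hardest to reconstruct is the strongest anomaly candidate
most_anomalous = max(range(len(windows)), key=errors.__getitem__)
print(errors, most_anomalous)
```

A real AE learns its compression from data, but the scoring step is the same: windows with a large reconstruction error are the ones that do not fit the patterns the model has captured.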

Getting ready

We’ll continue with the New York City taxi time series in this recipe. In terms of framework, we’ll show how to build an AE using PyTorch Lightning. This means that we’ll build a data module to handle the data preprocessing and another module for handling the training and inference of the NN.

How to do it…

This recipe is split into three parts. First, we build the data module based on PyTorch. Then, we create an AE module. Finally, we combine the two parts to build an anomaly detection system:

  1. Let’s start by building the data module. We create a class called...

Building an AE using PyOD

PyOD is a Python library that is devoted to anomaly detection. It contains several reconstruction-based algorithms such as AEs. In this recipe, we’ll build an AE using PyOD to detect anomalies in time series.

Getting ready

You can install PyOD using the following command:

pip install pyod

We’ll use the same dataset as in the previous recipe. So, we start with the dataset object created in the Prediction-based anomaly detection using DL recipe. Let’s see how to transform this data to build an AE with PyOD.

How to do it…

The following steps show how to build an AE and predict the probability of anomalies:

  1. We start by transforming the time series using a sliding window with the following code:
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    N_LAGS = 144
    series = dataset['y']
    input_data = []
    for i in range(N_LAGS, series.shape[0]):
        input_data.append(series...
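As a rough illustration of the sliding-window transformation in step 1, the following sketch turns a short list into overlapping rows of lagged values. It is not the recipe's code: the helper name `sliding_windows`, the slicing, and the toy values are assumptions, and a window of 3 is used instead of the recipe's `N_LAGS = 144` to keep the output readable.

```python
def sliding_windows(values, n_lags):
    """Return runs of n_lags consecutive observations, one row per position."""
    # Mirrors the loop bounds range(N_LAGS, len(series)) used in the recipe
    return [values[i - n_lags:i] for i in range(n_lags, len(values))]

# Toy series (hypothetical data)
values = [1, 2, 3, 4, 5, 6]
rows = sliding_windows(values, n_lags=3)
for row in rows:
    print(row)
```

Each row then plays the role of one input sample for the AE, so a series of length n yields n - n_lags samples with these loop bounds.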

Creating a VAE for time series anomaly detection

Building on the foundation laid in the previous recipe, we now turn our attention to VAEs, a more sophisticated and probabilistic approach to anomaly detection in time series data. Unlike traditional AEs, VAEs introduce a probabilistic interpretation, making them more adept at handling inherent uncertainties in real-world data.
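For reference, the probabilistic interpretation can be summarized by the standard VAE training objective, the evidence lower bound (ELBO). The notation is the usual textbook one rather than anything defined in this recipe: x is an input window, z is the latent variable, q_φ(z | x) is the encoder, p_θ(x | z) is the decoder, and p(z) is the prior (typically a standard normal).

```latex
\mathcal{L}(\theta, \phi; x) =
  \mathbb{E}_{q_\phi(z \mid x)}\left[ \log p_\theta(x \mid z) \right]
  - D_{\mathrm{KL}}\left( q_\phi(z \mid x) \,\|\, p(z) \right)
```

The first term rewards accurate reconstruction, and the second keeps the encoder's distribution close to the prior. For anomaly detection, windows that the trained model reconstructs poorly are treated as anomaly candidates, as with a plain AE.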

Getting ready

The code in this recipe is based on PyOD. We also use the same dataset as in the previous recipe:

N_LAGS = 144
series = dataset['y']

Now, let’s see how to create a VAE for time series anomaly detection.

How to do it…

We begin by preparing our dataset, as in the previous recipe:

  1. The dataset is first transformed using a sliding window, a technique that helps the model understand temporal dependencies within the time series:
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    import numpy as np
    input_data = []
    for i in range(N_LAGS, series...

Using GANs for time series anomaly detection

GANs have gained significant popularity in various fields of ML, particularly in image generation and modification. However, their application to time series data, especially for anomaly detection, is an emerging area of research and practice. In this recipe, we focus on using GANs, specifically Anomaly Detection with Generative Adversarial Networks (AnoGAN), to detect anomalies in time series data.
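As background, the anomaly score in the original AnoGAN formulation combines two terms. The symbols below follow that formulation and are not taken from this recipe, and a library implementation such as PyOD's may differ in its details: G is the generator, f is an intermediate feature layer of the discriminator, z* is the latent point whose generated sample best matches x, and λ is a weighting factor.

```latex
A(x) = (1 - \lambda)\, R(x) + \lambda\, D(x), \qquad
R(x) = \sum \left| x - G(z^{*}) \right|, \qquad
D(x) = \sum \left| f(x) - f\left(G(z^{*})\right) \right|
```

The residual term R(x) measures how far the best generated sample is from the input, and the discrimination term D(x) measures the same gap in the discriminator's feature space. A large score suggests that the generator, which was trained on typical data, cannot reproduce the observation.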

Getting ready

Before diving into the implementation, ensure that you have the PyOD library installed. We will continue using the taxi trip dataset for this recipe, which provides a real-world context for time series anomaly detection.

How to do it…

The implementation involves several steps: data preprocessing, defining and training the AnoGAN model, and finally, performing anomaly detection:

  1. We start by loading the dataset and preparing it for the AnoGAN model. The dataset is transformed in the same way as...