Reader small image

You're reading from  Deep Learning for Time Series Cookbook

Product typeBook
Published inMar 2024
PublisherPackt
ISBN-139781805129233
Edition1st Edition
Right arrow
Authors (2):
Vitor Cerqueira
Vitor Cerqueira
author image
Vitor Cerqueira

​Vitor Cerqueira is a time series researcher with an extensive background in machine learning. Vitor obtained his Ph.D. degree in Software Engineering from the University of Porto in 2019. He is currently a Post-Doctoral researcher in Dalhousie University, Halifax, developing machine learning methods for time series forecasting. Vitor has co-authored several scientific articles that have been published in multiple high-impact research venues.
Read more about Vitor Cerqueira

Luís Roque
Luís Roque
author image
Luís Roque

Luís Roque, is the Founder and Partner of ZAAI, a company focused on AI product development, consultancy, and investment in AI startups. He also serves as the Vice President of Data & AI at Marley Spoon, leading teams across data science, data analytics, data product, data engineering, machine learning operations, and platforms. In addition, he holds the position of AI Advisor at CableLabs, where he contributes to integrating the broadband industry with AI technologies. Luís is also a Ph.D. Researcher in AI at the University of Porto's AI&CS lab and oversees the Data Science Master's program at Nuclio Digital School in Barcelona. Previously, he co-founded HUUB, where he served as CEO until its acquisition by Maersk.
Read more about Luís Roque

View More author details
Right arrow

Resampling a time series

Time series resampling is the process of changing the frequency of a time series, for example, from hourly to daily. This task is a common preprocessing step in time series analysis and this recipe shows how to do it with pandas.

Getting ready

Changing the frequency of a time series is a common preprocessing step before analysis. For example, the time series used in the preceding recipes has an hourly granularity. Yet, our goal may be to study daily variations. In such cases, we can resample the data into a different period. Resampling is also an effective way of handling irregular time series – those that are collected in irregularly spaced periods.

How to do it…

We’ll go over two different scenarios where resampling a time series may be useful: when changing the sampling frequency and when dealing with irregular time series.

The following code resamples the time series into a daily granularity:

series_daily = series.resample('D').sum()

The daily granularity is specified with the input D to the resample () method. The values of each corresponding day are summed together using the sum() method.

Most time series analysis methods work under the assumption that the time series is regular; in other words, it is collected in regularly spaced time intervals (for example, every day). But some time series are naturally irregular. For instance, the sales of a retail product occur at arbitrary timestamps as customers arrive at a store.

Let us simulate sale events with the following code:

import numpy as np
import pandas as pd
n_sales = 1000
start = pd.Timestamp('2023-01-01 09:00')
end = pd.Timestamp('2023-04-01')
n_days = (end – start).days + 1
irregular_series = pd.to_timedelta(np.random.rand(n_sales) * n_days,
                                   unit='D') + start

The preceding code creates 1000 sale events from 2023-01-01 09:00 to 2023-04-01. A sample of this series is shown in the following table:

ID

Timestamp

1

2023-01-01 15:18:10

2

2023-01-01 15:28:15

3

2023-01-01 16:31:57

4

2023-01-01 16:52:29

5

2023-01-01 23:01:24

6

2023-01-01 23:44:39

Table 1.2: Sample of an irregular time series

Irregular time series can be transformed into a regular frequency by resampling. In the case of sales, we will count how many sales occurred each day:

ts_sales = pd.Series(0, index=irregular_series)
tot_sales = ts_sales.resample('D').count()

First, we create a time series of zeros based on the irregular timestamps (ts_sales). Then, we resample this dataset into a daily frequency (D) and use the count() method to count how many observations occur each day. The tot_sales reconstructed time series can be used for other tasks, such as forecasting daily sales.

How it works…

A sample of the reconstructed time series concerning solar radiation is shown in the following table:

Datetime

Incoming Solar

2007-10-01

1381.5

2007-10-02

3953.2

2007-10-03

3098.1

2007-10-04

2213.9

Table 1.3: Solar radiation time series after resampling

Resampling is a cornerstone preprocessing step in time series analysis. This technique can be used to change a time series into a different granularity or to convert an irregular time series into a regular one.

The summary statistic is an important input to consider. In the first case, we used sum to add the hourly solar radiation values observed each day. In the case of the irregular time series, we used the count() method to count how many events occurred in each period. Yet, you can use other summary statistics according to your needs. For example, using the mean would take the average value of each period to resample the time series.

There’s more…

We resampled to daily granularity. A list of available options is available here: https://pandas.pydata.org/docs/user_guide/timeseries.html#dateoffset-objects.

Previous PageNext Page
You have been reading a chapter from
Deep Learning for Time Series Cookbook
Published in: Mar 2024Publisher: PacktISBN-13: 9781805129233
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Authors (2)

author image
Vitor Cerqueira

​Vitor Cerqueira is a time series researcher with an extensive background in machine learning. Vitor obtained his Ph.D. degree in Software Engineering from the University of Porto in 2019. He is currently a Post-Doctoral researcher in Dalhousie University, Halifax, developing machine learning methods for time series forecasting. Vitor has co-authored several scientific articles that have been published in multiple high-impact research venues.
Read more about Vitor Cerqueira

author image
Luís Roque

Luís Roque, is the Founder and Partner of ZAAI, a company focused on AI product development, consultancy, and investment in AI startups. He also serves as the Vice President of Data & AI at Marley Spoon, leading teams across data science, data analytics, data product, data engineering, machine learning operations, and platforms. In addition, he holds the position of AI Advisor at CableLabs, where he contributes to integrating the broadband industry with AI technologies. Luís is also a Ph.D. Researcher in AI at the University of Porto's AI&CS lab and oversees the Data Science Master's program at Nuclio Digital School in Barcelona. Previously, he co-founded HUUB, where he served as CEO until its acquisition by Maersk.
Read more about Luís Roque