
You're reading from Practical Time Series Analysis

Product type: Book
Published in: Sep 2017
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781788290227
Edition: 1st Edition
Languages: Python
Concepts: Data Visualization

Authors (2):

Avishek Pal

PKS Prakash


Chapter 2. Understanding Time Series Data

In the previous chapter, we touched upon a general approach to time series analysis, which consists of two main steps:

Data visualization to check the presence of trend, seasonality, and cyclical patterns
Adjustment of trend and seasonality to generate stationary series

Generating stationary data is important for improving a time series forecasting model. Removing the trend, seasonal, and cyclical components leaves irregular fluctuations that cannot be modeled using only the time index as an explanatory variable. Therefore, to further improve forecasting, the irregular fluctuations are assumed to be independent and identically distributed (iid) observations and are modeled by a linear regression on variables other than the time index.

For example, house prices might exhibit both trend and seasonal (for example, quarterly) variations. However, the residuals left after adjusting trend and seasonality might actually depend on...

Advanced processing and visualization of time series data

In many cases, the original time series needs to be transformed into aggregate statistics. For example, observations in the original time series might have been recorded every second; however, to perform any meaningful analysis, the data must be aggregated every minute. This requires resampling the observations over periods that are longer than the granular time indices in the original data. Aggregate statistics such as the mean, median, and variance are calculated for each of the longer periods of time.
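The second-to-minute aggregation described above can be sketched with pandas as follows. This is a minimal illustration on a made-up series (the values, the start date, and the names `per_second` and `per_minute` are assumptions, not taken from the book):

```python
import numpy as np
import pandas as pd

# Hypothetical series: one reading per second for three minutes (180 points).
idx = pd.date_range("2017-01-01 00:00:00", periods=180,
                    freq=pd.Timedelta(seconds=1))
per_second = pd.Series(np.arange(180, dtype=float), index=idx)

# Resample into one-minute bins and compute aggregate statistics per bin.
per_minute = per_second.resample("1min").agg(["mean", "median", "var"])
print(per_minute)
```

Each row of `per_minute` summarizes the 60 per-second readings that fall within that minute.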

Another example of data pre-processing for time series is computing aggregates over similar segments in the data. Consider the monthly sales of cars manufactured by company X, where the data exhibits monthly seasonality, due to which sales during a month of a given year show patterns similar to the sales of the same month in the previous and next years. To highlight this kind of seasonality we must remove the long-run trend...
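One possible way to compute aggregates over similar segments is sketched below. The sales figures are synthetic, and the trailing 12-month rolling mean used as the trend estimate is an assumption made for illustration; the text above does not say which detrending method is used.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly sales: linear trend plus a fixed seasonal pattern.
idx = pd.date_range("2010-01-01", periods=60, freq="MS")
pattern = [5, 3, 0, -2, -4, -5, -4, -2, 0, 2, 3, 4]  # sums to zero
sales = pd.Series(100 + 2.0 * np.arange(60) + np.tile(pattern, 5), index=idx)

# Assumed trend estimate: trailing 12-month rolling mean.
trend = sales.rolling(window=12).mean()
detrended = (sales - trend).dropna()

# Group the detrended values by calendar month to expose the seasonal profile.
monthly_profile = detrended.groupby(detrended.index.month).mean()
print(monthly_profile)
```

Because the rolling window is trailing rather than centred, the profile carries a constant offset, but the relative ordering of the months (highest in January, lowest in June for this made-up pattern) is preserved.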

Resampling time series data

The technique of resampling is illustrated using a time series of chemical concentration readings taken every two hours between 1st January 1975 and 17th January 1975. The dataset has been downloaded from http://datamarket.com and is also available in the datasets folder of this book's GitHub repo.

We start by importing the packages required for running this example:

from __future__ import print_function
import os
import pandas as pd
import numpy as np
# IPython/Jupyter magic; omit this line when running as a plain Python script
%matplotlib inline
from matplotlib import pyplot as plt

Then we set the working directory as follows:

os.chdir('D:/Practical Time Series')

This is followed by reading the data from the CSV file into a pandas.DataFrame and displaying the shape and the first 10 rows of the DataFrame:

df = pd.read_csv('datasets/chemical-concentration-readings.csv') 
print('Shape of the dataset:', df.shape) 
df.head(10)

The preceding code returns the following output:

Shape of the dataset: (197, 2)
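A DataFrame of this shape could then be resampled to daily means roughly as follows. Since the CSV is not reproduced here, this sketch builds a stand-in with 197 two-hourly rows and random placeholder values; the column names follow the DataFrame output, while the daily frequency and the variable name `daily_mean` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Stand-in for the CSV: 197 two-hourly readings starting 1975-01-01 00:00.
# The concentration values are random placeholders, not the book's data.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Timestamp": pd.date_range("1975-01-01", periods=197,
                               freq=pd.Timedelta(hours=2)),
    "Chemical conc.": rng.normal(17.0, 0.4, size=197),
})
print('Shape of the dataset:', df.shape)

# Parse the timestamps (needed when they are read from a CSV as strings)
# and use them as the index so that time-based resampling is possible.
df["Timestamp"] = pd.to_datetime(df["Timestamp"])
df = df.set_index("Timestamp")

# Aggregate the two-hourly readings into one mean value per calendar day.
daily_mean = df["Chemical conc."].resample("D").mean()
print(daily_mean.head())
```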

Stationary processes

Properties of data such as central tendency, dispersion, skewness, and kurtosis are called sample statistics. Mean and variance are two of the most commonly used sample statistics. In any analysis, data is collected by gathering information from a sample of the larger population. Mean, variance, and other properties are then estimated based on the sample data. Hence these are referred to as sample statistics.

An important assumption in statistical estimation theory is that, for sample statistics to be reliable, the population does not undergo any fundamental or systemic shifts over the individuals in the sample or over the time during which the data has been collected. This assumption ensures that sample statistics do not alter and will hold for entities that are outside the sample used for their estimation.

This assumption also applies to time series analysis so that mean, variance, and auto-correlation estimated from the sample can be used as a reasonable estimate for...
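To see why this assumption matters, the following sketch compares the sample mean over the first and second halves of two synthetic series. It is an informal check on made-up data, not a formal stationarity test; the series, the drift of 0.05 per step, and the helper `half_means` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
noise = rng.normal(0.0, 1.0, size=n)

stationary = noise                      # constant mean and variance
trending = noise + 0.05 * np.arange(n)  # mean drifts upward over time

def half_means(x):
    """Return the sample means of the first and second halves of x."""
    half = len(x) // 2
    return x[:half].mean(), x[half:].mean()

print("stationary:", half_means(stationary))
print("trending:  ", half_means(trending))
```

For the stationary series the two half-sample means are close, so either is a reasonable estimate for the whole series; for the trending series they differ widely, so a mean estimated from one part of the sample does not carry over to the rest.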

Time series decomposition

The objective of time series decomposition is to model the long-term trend and seasonality and estimate the overall time series as a combination of them. Two popular models for time series decomposition are:

Additive model
Multiplicative model

The additive model formulates the original time series (x_t) as the sum of the trend cycle (F_t) and seasonal (S_t) components as follows:

x_t = F_t + S_t + ε_t

The residuals ε_t obtained after adjusting the trend and seasonal components are the irregular variations. The additive model is usually applied when there is a time-dependent trend cycle component, but independent seasonality that does not change over time.

The multiplicative decomposition model, which gives the time series as the product of the trend, seasonal, and irregular components, is useful when there is time-varying seasonality:

x_t = F_t × S_t × ε_t

Taking logarithms converts the multiplicative model into an additive model of the logarithms of the individual components. The multiplicative...
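The effect of the logarithm can be checked numerically. The sketch below builds a synthetic multiplicative series from assumed trend, seasonal, and irregular components (all values are made up, and every component is kept strictly positive so that the logarithm is defined) and confirms that the log of the series equals the sum of the logs of the components.

```python
import numpy as np

t = np.arange(48)
trend = 100.0 + 2.0 * t                                 # trend cycle component
seasonal = 1.0 + 0.2 * np.sin(2 * np.pi * t / 12)       # seasonal factor
rng = np.random.default_rng(1)
irregular = np.exp(rng.normal(0.0, 0.02, size=t.size))  # strictly positive

# Multiplicative model: the series is the product of the three components.
x = trend * seasonal * irregular

# After taking logarithms the model becomes additive in the log components.
log_x = np.log(x)
log_sum = np.log(trend) + np.log(seasonal) + np.log(irregular)
print(np.allclose(log_x, log_sum))
```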

Summary

We started this chapter by discussing advanced data processing techniques such as resampling, group-by, and moving window computations to obtain aggregate statistics from a time series. Next, we described stationary time series and discussed statistical hypothesis tests such as the Ljung-Box test and the Augmented Dickey-Fuller test to verify the stationarity of a time series. Stationarizing a non-stationary time series is important for time series forecasting. Therefore, we discussed two different approaches to stationarizing a time series. Firstly, we described the method of differencing, which covers first, second, and seasonal differencing. Secondly, we discussed time series decomposition using the statsmodels.tsa API for additive and multiplicative models. In the next chapter, we delve deeper into exponential smoothing techniques, which deal with noisy time series data.


Authors (2)

Avishek Pal

Dr. Avishek Pal, PhD, is a software engineer, data scientist, author, and an avid Kaggler living in Hyderabad, India. He achieved his Bachelor of Technology degree in industrial engineering from the Indian Institute of Technology (IIT) Kharagpur and earned his doctorate in 2015 from University of Warwick, Coventry, United Kingdom. He started his career as a software engineer at IBM India developing middleware solutions for telecom clients. This was followed by stints at a start-up product development company followed by Ericsson, the global telecom giant. After doctoral studies, Avishek started his career in India as a lead machine learning engineer for a leading US-based investment company. He is currently working at Microsoft as a senior data scientist. Avishek has published several research papers in reputed international conferences and journals.

PKS Prakash

Dr. PKS Prakash is a data scientist and author. He has spent the last 12 years in developing many data science solutions in several practical areas in healthcare, manufacturing, pharmaceuticals, and e-commerce. He currently works as the data science manager at ZS Associates. He is the co-founder of Warwick Analytics, a spin-off from University of Warwick, UK. Prakash has published articles widely in research areas of operational research and management, soft computing tools, and advanced algorithms in leading journals such as IEEE-Trans, EJOR, and IJPR, among others. He has edited an article on Intelligent Approaches to Complex Systems and contributed to books such as Evolutionary Computing in Advanced Manufacturing published by WILEY and Algorithms and Data Structures using R and R Deep Learning Cookbook, published by PACKT.


	Timestamp	Chemical conc.
0	1975-01-01 00:00...