You're reading from Building Statistical Models in Python

Product type: Book

Published in: Aug 2023

Reading level: Intermediate

Publisher: Packt

ISBN-13: 9781804614280

Edition: 1st Edition

Languages: Python

Concepts: Statistics

Authors (3):

Huy Hoang Nguyen

Paul N Adams

Stuart J Miller


Introduction to Time Series

In Chapter 9, Discriminant Analysis, we concluded our overview of statistical classification modeling by introducing conditional probability using Bayes’ theorem, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA). In this chapter, we will introduce time series, the underlying statistical concepts, and how to apply them in everyday analysis. We will introduce the topic with the distinction between time-series data and what we have discussed up to this point in the book. We then provide an overview of what to expect with time-series modeling and the goals it can be leveraged to achieve. Within the context of time series, we then reintroduce the mean and variance statistical parameters, in addition to correlation. We provide an overview of linear differencing, cross-correlation, and autoregressive (AR) and moving average (MA) properties and how to identify their ordering using autocorrelation function (ACF) and partial ACF...

What is a time series?

In this chapter and the next few chapters, we will work with a type of data called time-series data. Up until this point, we have worked with independent data, that is, data consisting of samples that are not related to one another. A time series is typically a sequence of measurements of the same subject taken over time, which makes the observations in this type of data related. Time series are all around us. A few common examples are daily temperature measurements, stock price ticks, and the heights of ocean tides. While a time series does not need to be measured at fixed intervals, in this book we will primarily be concerned with measurements taken at fixed intervals, such as daily or every second.
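As a rough illustration of what "fixed intervals" means in practice, the sketch below builds a small daily series and checks whether its timestamps are evenly spaced. The helper name `is_regularly_sampled` and the temperature values are invented for this example and do not come from the book.

```python
from datetime import datetime, timedelta


def is_regularly_sampled(timestamps):
    """Return True if consecutive timestamps are separated by one constant step."""
    if len(timestamps) < 3:
        return True  # with fewer than two gaps there is nothing to compare
    steps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(steps) == 1


# Hypothetical daily temperature readings (values invented for illustration).
start = datetime(2023, 1, 1)
temperatures = [3.1, 2.8, 4.0, 5.2, 4.7]
daily_index = [start + timedelta(days=i) for i in range(len(temperatures))]
series = list(zip(daily_index, temperatures))

# Shift one reading by six hours to produce an irregularly sampled index.
irregular_index = daily_index[:2] + [daily_index[2] + timedelta(hours=6)] + daily_index[3:]

print(is_regularly_sampled(daily_index))
print(is_regularly_sampled(irregular_index))
```

Libraries such as pandas provide richer time-indexed structures, but the underlying idea is the same: each value is paired with a point in time, and the spacing between those points is constant.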

Let’s look at some notation. In the following equation, we have a variable x that is repeatedly sampled over time. The subscripts enumerate the sample points (sample 1 through sample t), and the whole series of samples is denoted X. The subscript...

Goals of time series analysis

There are two goals in time-series analysis:

Identifying any patterns in the time series
Forecasting future values of the time series

We can use time-series analysis methods to uncover the nature of a time series. At the most basic level, we may want to know if a series appears to be random or if a time series appears to exhibit a pattern. If a time series has a pattern, we can determine if it has seasonal behavior, cyclical patterns, or exhibits trending behavior. We will investigate the behaviors of time series both by observation and by the results of fitting models. Models can provide insight into the nature of a series and allow us to forecast the future values of a time series.

The other goal of time-series analysis is forecasting. We see examples of forecasting in many common situations, such as weather forecasting and stock price forecasting. It is important to keep in mind that the methods of forecasting we cover in this...

Statistical measurements

When using time-series models to work with serially correlated datasets, we need to understand the mean and variance within the context of time, in addition to autocorrelation and cross-correlation. Understanding these statistics helps build an intuition about how time-series models work and when they are more useful than models that do not account for time.

Mean

In time-series analysis, the sample mean of a series is the sum of all values across each point in time in the series divided by the count of values. Where t represents each discrete time step and n is the total number of time steps, we can calculate the sample mean of a time series as follows:

$$\bar{X} = \frac{1}{n} \sum_{t=1}^{n} x_t$$
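The sample mean formula translates directly into code. The following is a minimal sketch in plain Python. The function name `sample_mean` and the example values are assumptions made for illustration.

```python
def sample_mean(series):
    """Sum the values over all n time steps and divide by n."""
    n = len(series)
    if n == 0:
        raise ValueError("the series must contain at least one observation")
    return sum(series) / n


# Hypothetical observations x_1, ..., x_4 (values invented for illustration).
x = [2.0, 4.0, 6.0, 8.0]
print(sample_mean(x))  # 5.0
```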

There are two types of processes generating time series; one is an ergodic process and the other is non-ergodic. An ergodic process has consistent output independent of time, whereas a non-ergodic...

The white-noise model

Any time series can be considered to possess two fundamental elements: signal and noise. We can represent this mathematically as follows:

y(t) = signal(t) + noise(t)

The signal is some predictable pattern that we can model with a mathematical function. But the noise element in a time series is unpredictable and so cannot be modeled. Thinking of a time series this way leads to two consequential points:

Before attempting to model, we should verify that the time series is not consistent with noise.
Once we have fit a model to a time series, we should verify that the residuals are consistent with noise.

Regarding the first point, if a time series is consistent with noise, there is no predictable pattern to model, and attempting to model it could lead to misleading results. Regarding the second point, if the residuals of a time-series model are not consistent with noise, then there are additional patterns we can further model, and the...
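One informal way to check whether a series is consistent with noise is to compute its sample autocorrelation at several lags and see how many values fall outside the approximate 95% band of ±1.96/√n expected for white noise. The sketch below does this for simulated noise and for a simulated signal-plus-noise series. The function names, the sine-wave signal, and the choice of 10 lags are assumptions for illustration, not the book's procedure, and a formal test (such as the Ljung-Box test) would normally be used alongside it.

```python
import math
import random


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer `lag`."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    numerator = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numerator / denominator


def lags_outside_noise_band(series, max_lag=10):
    """List the lags whose autocorrelation exceeds the approximate 95% white-noise band."""
    bound = 1.96 / math.sqrt(len(series))
    return [
        lag
        for lag in range(1, max_lag + 1)
        if abs(autocorrelation(series, lag)) > bound
    ]


rng = random.Random(42)  # fixed seed so the simulation is repeatable
n = 2000
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
signal = [2.0 * math.sin(2 * math.pi * t / 50) for t in range(n)]  # assumed pattern
y = [s + e for s, e in zip(signal, noise)]  # y(t) = signal(t) + noise(t)

print("noise, lags outside band:", lags_outside_noise_band(noise))
print("signal + noise, lags outside band:", lags_outside_noise_band(y))
```

Because the band is a 95% band, roughly one lag in twenty can fall outside it by chance even for pure noise, so an occasional exceedance is not evidence of a pattern on its own. The same check can be applied to the residuals of a fitted model.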

Stationarity

In this section, we provide an overview of stationary and non-stationary time series. Broadly speaking, the two differ in how their statistical properties, such as mean, variance, and autocorrelation, behave over time: these properties are constant in a stationary time series but change over time in a non-stationary one. In particular, a time series with a trend or seasonality is non-stationary because the trend or seasonality affects these statistical properties. The following examples illustrate the behaviors of stationary versus non-stationary time series [1]:

Figure 10.12 – Examples of stationary and non-stationary time series

To check whether a time series is stationary, we check the following three conditions:

The mean is independent of time:

$E[X_t] = \mu$ for all $t$

The variance is independent of time:

$\mathrm{Var}[X_t] = \sigma^2$ for all $t$

No autocorrelation...
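The first two conditions can be explored informally by splitting a series in half and comparing the mean and variance of each half. The sketch below does this for simulated white noise and for the same noise with a linear trend added. The helper name `half_summaries`, the trend slope, and the series length are assumptions for illustration. This is an eyeball check rather than a formal stationarity test, and it does not address the third condition.

```python
import random
import statistics


def half_summaries(series):
    """Return (mean, variance) for the first half and for the second half of `series`."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (
        (statistics.fmean(first), statistics.pvariance(first)),
        (statistics.fmean(second), statistics.pvariance(second)),
    )


rng = random.Random(0)  # fixed seed so the simulation is repeatable
n = 1000
stationary = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Same noise plus an assumed linear trend, which makes the mean depend on t.
trending = [0.01 * t + e for t, e in enumerate(stationary)]

for name, data in [("stationary", stationary), ("trending", trending)]:
    (mean_1, var_1), (mean_2, var_2) = half_summaries(data)
    print(f"{name}: mean {mean_1:.2f} -> {mean_2:.2f}, variance {var_1:.2f} -> {var_2:.2f}")
```

For the white-noise series, the two halves should show similar means and variances, while the trending series should show a clear shift in the mean between halves.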

Summary

This chapter started with an introduction to time series. We provided an overview of what a time series is and how it can be used to meet specific goals. We also discussed the criteria for differentiating time-series data from data that does not depend on time. We then discussed stationarity: which factors are important for it, how to measure them, and how to handle cases where a series is not stationary. From there, we covered the primary functions of ACF and PACF analysis and how to make inferences about processes using variance around the mean. Additionally, we provided an introduction to time-series modeling with an overview of the white-noise model and the basic concepts behind autoregressive and moving average components, which help form the basis of ARIMA and seasonal autoregressive integrated moving average (SARIMA) time-series models.

In Chapter 11, ARIMA Models, we will also move deeper into the discussion of autoregressive, moving average...

References

[1] André Bauer, Automated Hybrid Time Series Forecasting: Design, Benchmarking, and Use Cases, University of Chicago, 2021.


Authors (3)

Huy Hoang Nguyen

Huy Hoang Nguyen is a Mathematician and a Data Scientist with far-ranging experience, championing advanced mathematics and strategic leadership, and applied machine learning research. He holds a Master's in Data Science and a PhD in Mathematics. His previous work was related to Partial Differential Equations, Functional Analysis and their applications in Fluid Mechanics. He transitioned from academia to the healthcare industry and has performed different Data Science projects from traditional Machine Learning to Deep Learning.

Paul N Adams

Paul Adams is a Data Scientist with a background primarily in the healthcare industry. Paul applies statistics and machine learning in multiple areas of industry, focusing on projects in process engineering, process improvement, metrics and business rules development, anomaly detection, forecasting, clustering and classification. Paul holds a Master of Science in Data Science from Southern Methodist University.

Stuart J Miller

Stuart Miller is a Machine Learning Engineer with degrees in Data Science, Electrical Engineering, and Engineering Physics. Stuart has worked at several Fortune 500 companies, including Texas Instruments and StateFarm, where he built software that utilized statistical and machine learning techniques. Stuart is currently an engineer at Toyota Connected helping to build a more modern cockpit experience for drivers using machine learning.
