You're reading from Building Statistical Models in Python

Product type: Book
Published in: Aug 2023
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781804614280
Edition: 1st Edition
Authors (3):
Huy Hoang Nguyen

Huy Hoang Nguyen is a Mathematician and a Data Scientist with far-ranging experience, championing advanced mathematics and strategic leadership, and applied machine learning research. He holds a Master's in Data Science and a PhD in Mathematics. His previous work was related to Partial Differential Equations, Functional Analysis and their applications in Fluid Mechanics. He transitioned from academia to the healthcare industry and has performed different Data Science projects from traditional Machine Learning to Deep Learning.

Paul N Adams

Paul Adams is a Data Scientist with a background primarily in the healthcare industry. Paul applies statistics and machine learning in multiple areas of industry, focusing on projects in process engineering, process improvement, metrics and business rules development, anomaly detection, forecasting, clustering and classification. Paul holds a Master of Science in Data Science from Southern Methodist University.

Stuart J Miller

Stuart Miller is a Machine Learning Engineer with degrees in Data Science, Electrical Engineering, and Engineering Physics. Stuart has worked at several Fortune 500 companies, including Texas Instruments and StateFarm, where he built software that utilized statistical and machine learning techniques. Stuart is currently an engineer at Toyota Connected helping to build a more modern cockpit experience for drivers using machine learning.

Multivariate Time Series

The models we discussed in the previous chapter only depended on the previous values of the single variable of interest. Those models are appropriate when we only have a single variable in our time series. However, it is common to have multiple variables in time-series data. Often, these other variables in the series can improve forecasting of the variable of interest. We will discuss models for time series with multiple variables in this chapter. We will first discuss the correlation relationship between time-series variables, then discuss how we can model multivariate time series. While there are many models for multivariate time-series data, we will discuss two models that are both powerful and widely used: autoregressive integrated moving average with exogenous variables (ARIMAX) and vector autoregressive (VAR). Understanding these two models will extend the reader’s model toolbox and provide building blocks for the reader to learn more about multivariate...

Multivariate time series

In the previous chapter, we discussed models for univariate time series, that is, time series of one variable. However, in many modeling situations, it is common to have multiple time-varying variables that are measured together. A time series consisting of multiple time-varying variables is called a multivariate time series. Each variable in the time series is called a covariate. For example, a time series of weather data might include temperature, rain amount, wind speed, and relative humidity. Each of these variables in the weather dataset is a univariate time series on its own; together, they form a multivariate time series, and the variables are covariates of one another.

Mathematically, we typically represent a multivariate time-series as a vector-valued series, as follows:

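$$
X = \begin{bmatrix} x_{0,0} \\ x_{0,1} \\ \vdots \end{bmatrix}, \begin{bmatrix} x_{1,0} \\ x_{1,1} \\ \vdots \end{bmatrix}, \dots, \begin{bmatrix} x_{t,0} \\ x_{t,1} \\ \vdots \end{bmatrix}
$$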

Here, each X instance consists of multiple...
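
To make the vector-valued view concrete, the following minimal sketch stores a small multivariate series as a NumPy array with one row per time step and one column per variable. The weather readings and variable names are made up for illustration and do not come from the chapter's datasets.

```python
import numpy as np

# Hypothetical weather readings: one row per time step, one column per variable.
variables = ["temperature", "rain", "wind_speed", "humidity"]
X = np.array([
    [21.5, 0.0, 3.2, 0.45],
    [22.1, 0.0, 2.8, 0.43],
    [19.8, 1.2, 5.1, 0.71],
    [18.4, 3.4, 6.0, 0.83],
])

n_steps, n_vars = X.shape   # number of time steps, number of variables

x_2 = X[2]                  # the vector of all variables observed at time t = 2
temperature = X[:, 0]       # one column is a univariate time series
```

Indexing a row recovers one vector of the series, while indexing a column recovers one of the univariate series that make up the multivariate series.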

ARIMAX

In the previous chapter, we discussed the ARIMA family of models and demonstrated how to model univariate time-series data. However, as we mentioned in the previous section, many time series are multivariate, such as stock data, weather data, or economic data. In this section, we will discuss how we can incorporate information from covariate variables when modeling time-series data.

When we model a multivariate time series, we typically have a variable we are interested in forecasting. This variable is commonly called the endogenous variable. The other covariates in the multivariate time series are called exogenous variables. Recall from Chapter 11, ARIMA Models, the equation representing the ARIMA model:

$$
y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q}
$$

Here, y ...
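
As a rough illustration of how an exogenous variable enters such a model, the sketch below fits a simplified autoregressive model with one exogenous regressor by ordinary least squares using NumPy. This is not the full ARIMAX model: it has no differencing and no moving-average terms, and the assumed form y_t = c + phi * y_{t-1} + beta * x_t + eps_t, the simulated data, and all parameter values are hypothetical. In practice, a library such as statsmodels, whose SARIMAX class accepts an exog argument, would typically be used instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a hypothetical series: y_t = c + phi * y_{t-1} + beta * x_t + eps_t
n = 500
c, phi, beta = 1.0, 0.6, 2.0
x = rng.normal(size=n)                 # exogenous variable
eps = rng.normal(scale=0.1, size=n)    # white-noise errors
y = np.zeros(n)                        # endogenous variable
for t in range(1, n):
    y[t] = c + phi * y[t - 1] + beta * x[t] + eps[t]

# Design matrix: intercept, lagged endogenous value, current exogenous value
A = np.column_stack([np.ones(n - 1), y[:-1], x[1:]])
coef, *_ = np.linalg.lstsq(A, y[1:], rcond=None)
c_hat, phi_hat, beta_hat = coef

# A one-step forecast needs the next value of the exogenous variable,
# which is assumed here to be known or forecast separately.
x_next = 0.5
y_next = c_hat + phi_hat * y[-1] + beta_hat * x_next
```

The last two lines show the practical cost of exogenous variables: the forecast for the endogenous variable cannot be computed until a value for the exogenous variable at the forecast time is available.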

VAR modeling

The AR(p), MA(q), ARMA(p,q), ARIMA(p,d,q), and SARIMA(p,d,q)(P,D,Q)_m models we looked at in the last chapter form the basis of multivariate VAR modeling. In this chapter, we have discussed ARIMA with exogenous variables (ARIMAX). We will now begin our discussion of the VAR model. First, it is important to understand that while ARIMAX requires leading (future) values of the exogenous variables in order to forecast, the VAR model requires no future values of these variables: every variable is regressed on the past values of all the variables, hence the name vector autoregressive, so by definition none of them is exogenous. To start, let us consider the two-variable, or bivariate, case. Consider a process y_t that is the output of two different input variables, y_{t1} and y_{t2}. Note that in matrix form, we are discussing the case of an n×m matrix (y_{n,m}), where n corresponds to the point in time and m corresponds to the variables involved (variables 1, 2, …, m). We exclude the comma from notation...
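
The bivariate case can be sketched as a VAR(1) model estimated by least squares with NumPy. The coefficient matrix, intercepts, noise scale, and simulated data are all hypothetical, and only a single lag is used. This is a minimal sketch of the idea rather than the chapter's worked example; statsmodels also provides a VAR class that handles lag selection and diagnostics.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical bivariate VAR(1): y_t = c + A @ y_{t-1} + e_t
A_true = np.array([[0.5, 0.2],
                   [0.1, 0.4]])
c_true = np.array([1.0, -0.5])

n = 5000
Y = np.zeros((n, 2))   # row t holds both variables at time t
for t in range(1, n):
    Y[t] = c_true + A_true @ Y[t - 1] + rng.normal(scale=0.1, size=2)

# Regress each variable on an intercept and the previous value of both variables
Z = np.column_stack([np.ones(n - 1), Y[:-1]])    # shape (n - 1, 3)
B, *_ = np.linalg.lstsq(Z, Y[1:], rcond=None)    # shape (3, 2), one column per equation
c_hat = B[0]       # estimated intercepts
A_hat = B[1:].T    # estimated lag-1 coefficient matrix

# The one-step forecast uses only past values of the series itself
y_next = c_hat + A_hat @ Y[-1]
```

Unlike the exogenous-variable case, the forecast in the last line needs nothing beyond the most recent observation of the series, because both variables are modeled jointly.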

Summary

In this chapter, we provided an overview of multivariate time series and how they differ from the univariate case. We then covered the math and intuition behind two popular approaches to solving problems using multivariate time-series models: ARIMAX and the VAR model framework. We walked through examples for each model using a step-by-step approach. This chapter concludes our discussions on time-series analysis and forecasting. At this point, you should be able to identify and assess the statistical properties of time series, transform them as needed, and construct models that are useful for fitting and forecasting both univariate and multivariate cases.

In the next chapter, we will begin our discussion on survival analysis with an introduction to time-to-event (TTE) variables.

References

[1] Liang, X., Zou, T., Guo, B., Li, S., Zhang, H., Zhang, S., Huang, H. and Chen, S. X. (2015). Assessing Beijing’s PM2.5 pollution: severity, weather impact, APEC and winter heating. Proceedings of the Royal Society A, 471, 20150257.

[2] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
