Forecasting with ARIMA Models

The Autoregressive Integrated Moving Average (ARIMA) model is the generic name for a family of forecasting models that are based on the Autoregressive (AR) and Moving Average (MA) processes. Among the traditional forecasting models (for example, linear regression, exponential smoothing, and so on), the ARIMA model is considered the most advanced and robust approach. In this chapter, we will introduce the model's components: the AR and MA processes and the differencing component. Furthermore, we will focus on methods and approaches for tuning the model's parameters with the use of differencing, the autocorrelation function (ACF), and the partial autocorrelation function (PACF).

In this chapter, we will cover the following topics:

  • The stationary state of time series data
  • The random walk process
  • The AR and MA processes
  • The ARMA and ARIMA...

Technical requirements

The following packages will be used in this chapter:

  • forecast: Version 8.5 and above
  • TSstudio: Version 0.1.4 and above
  • plotly: Version 4.8 and above
  • dplyr: Version 0.8.1 and above
  • lubridate: Version 1.7.4 and above
  • stats: Version 3.6.0 and above
  • datasets: Version 3.6.0 and above
  • base: Version 3.6.0 and above

You can access the code for this chapter from the following link:

https://github.com/PacktPublishing/Hands-On-Time-Series-Analysis-with-R/tree/master/Chapter11

The stationary process

One of the main assumptions of the ARIMA family of models is that the input series follows the stationary process structure. This assumption is based on the Wold representation theorem, which states that any stationary process can be represented as a linear combination of white noise. Therefore, before we dive into the ARIMA model components, let's pause and talk about the stationary process. The stationary process, in the context of time series data, describes a stochastic state of the series. Time series data is stationary if the following conditions hold:

  • The mean and variance of the series do not change over time
  • The correlation structure of the series, along with its lags, remains the same over time

In the following examples, we will utilize the arima.sim function from the stats package to simulate a stationary and non-stationary...
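
For instance, the following minimal sketch (with illustrative parameters rather than the book's exact simulation) generates both types of series with arima.sim:

set.seed(1234)

# Stationary AR(1) series: the mean and variance are constant over time
stationary_ts <- arima.sim(model = list(order = c(1, 0, 0), ar = 0.5), n = 500)

# Non-stationary series: first-order integration (d = 1) adds a stochastic
# trend, so the mean of the series changes over time
non_stationary_ts <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.5), n = 500)

plot.ts(stationary_ts, main = "Stationary AR(1) Series")
plot.ts(non_stationary_ts, main = "Non-Stationary Integrated Series")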

The AR process

The AR process defines the current value of the series, Yt, as a linear combination of the previous p lags of the series, and can be formalized with the following equation:

Yt = c + φ1Yt-1 + φ2Yt-2 + ... + φpYt-p + εt
The following terms are used in the preceding equation:

  • AR(p) is the notation for an AR process with order p
  • c represents a constant (or drift)
  • p defines the number of lags to regress against Yt
  • φi is the coefficient of the i-th lag of the series (here, φi must be between -1 and 1; otherwise, the series would trend up or down and therefore could not be stationary over time)
  • Yt-i is the i-th lag of the series
  • εt represents the error term, which is white noise
An AR process can be used on time series data if, and only if, the series is stationary. Therefore, before applying an AR process to a series, you will have to verify that the series is stationary. Otherwise, you will have to apply some...
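
Later in this chapter, we forecast an AR model stored in the md_ar object. As a minimal sketch (the simulated series and its coefficients are illustrative assumptions, not the book's exact example), such an AR(2) model could be trained with the ar function from the stats package:

set.seed(1234)

# Simulate a stationary AR(2) series with illustrative coefficients
ar2_ts <- arima.sim(model = list(order = c(2, 0, 0), ar = c(0.9, -0.3)), n = 500)

# Fit an AR model; by default, ar selects the order that minimizes the AIC
md_ar <- ar(ar2_ts)
md_ar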

The moving average process

In some cases, the forecasting model is unable to capture all of the series patterns, and therefore some information is left over in the model residuals (or forecasting error). The goal of the moving average process is to capture patterns in the residuals, if they exist, by modeling the relationship between Yt, the error term, εt, and the past q error terms of the model (that is, εt-1, εt-2, ..., εt-q). The structure of the MA process is fairly similar to that of the AR process. The following equation defines an MA process with order q:

Yt = μ + εt + θ1εt-1 + θ2εt-2 + ... + θqεt-q
The following terms are used in the preceding equation:

  • MA(q) is the notation for an MA process with order q
  • μ represents the mean of the series
  • εt, εt-1, ..., εt-q are white noise error terms
  • θi is the corresponding coefficient of εt-i
  • q defines the number of past error terms to be used in the equation
Like the AR process, the MA equation holds only if the...
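
As a quick illustration (the coefficients below are arbitrary), an MA(2) series can be simulated and estimated with the arima function from the stats package by setting the AR and differencing orders to zero:

set.seed(1234)

# Simulate an MA(2) series and estimate its coefficients
ma2_ts <- arima.sim(model = list(order = c(0, 0, 2), ma = c(0.5, 0.3)), n = 500)
md_ma <- arima(ma2_ts, order = c(0, 0, 2))
md_ma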

The ARMA model

Up until now, we have seen the applications of the AR and MA processes separately. However, in some cases, combining the two allows us to handle more complex time series data. The ARMA model is a combination of the AR(p) and MA(q) processes and can be written as follows:

Yt = c + φ1Yt-1 + ... + φpYt-p + εt + θ1εt-1 + ... + θqεt-q
The following terms are used in the preceding equation:

  • ARMA(p,q) defines an ARMA process with a p-order AR process and a q-order moving average process
  • Yt represents the series itself
  • c represents a constant (or drift)
  • p defines the number of lags to regress against Yt
  • φi is the coefficient of the i-th lag of the series
  • Yt-i is the i-th lag of the series
  • q defines the number of past error terms to be used in the equation
  • θi is the corresponding coefficient of εt-i
  • εt-1, ..., εt-q are white noise error terms
  • εt represents the error term, which is white noise

For instance, an ARMA(3,2) model is defined by the following equation:

Yt = c + φ1Yt-1 + φ2Yt-2 + φ3Yt-3 + εt + θ1εt-1 + θ2εt-2
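
In R, an ARMA(p,q) model can be estimated with the arima function by setting the middle (differencing) order to zero. A minimal sketch with illustrative orders and coefficients:

set.seed(1234)

# Simulate an ARMA(1,1) series and estimate its parameters
arma_ts <- arima.sim(model = list(order = c(1, 0, 1), ar = 0.6, ma = 0.4), n = 500)
md_arma <- arima(arma_ts, order = c(1, 0, 1))
md_arma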

Forecasting AR, MA, and ARMA models

Forecasting with any of the models we have seen so far is straightforward: we use the forecast function from the forecast package, in a similar manner to how we used it in the previous chapter. For instance, the following code demonstrates a forecast of the next 100 observations with the AR model we trained previously, in The AR process section, with the ar function:

library(forecast)

ar_fc <- forecast(md_ar, h = 100)

We can use the plot_forecast function from the TSstudio package to plot the forecast output:

plot_forecast(ar_fc,
              title = "Forecast AR(2) Model",
              Ytitle = "Value",
              Xtitle = "Year")

The output is the plot of the AR(2) model forecast.

The ARIMA model

One of the limitations of the AR, MA, and ARMA models is that they cannot handle non-stationary time series data. Therefore, if the input series is non-stationary, a preprocessing step is required to transform the series from a non-stationary state into a stationary state. The ARIMA model provides a solution for this issue by adding an integrated process to the ARMA model. The Integrated (I) process simply differences the series with its lags, where the degree of differencing is represented by the d parameter. The differencing process, as we saw previously, is one of the ways to transform a series from a non-stationary state into a stationary one. For instance, Yt - Yt-1 represents the first-order differencing of the series, while (Yt - Yt-1) - (Yt-1 - Yt-2) represents the second-order differencing. We can generalize the differencing process with the following...
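
In R, differencing corresponds to the diff function. The following sketch (using the built-in AirPassengers series purely for illustration) applies first- and second-order differencing:

# First-order differencing: Yt - Yt-1
first_diff <- diff(AirPassengers, differences = 1)

# Second-order differencing: (Yt - Yt-1) - (Yt-1 - Yt-2)
second_diff <- diff(AirPassengers, differences = 2)

plot.ts(first_diff, main = "AirPassengers - First-Order Differencing")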

The seasonal ARIMA model

The Seasonal ARIMA (SARIMA) model, as its name implies, is a designated version of the ARIMA model for time series with a seasonal component. As we saw in Chapter 6, Seasonality Analysis, and Chapter 7, Correlation Analysis, a time series with a seasonal component has a strong relationship with its seasonal lags. The SARIMA model utilizes the seasonal lags in a similar manner to how the ARIMA model utilizes the non-seasonal lags, with the AR and MA processes and differencing. It does this by adding the following three components to the ARIMA model:

  • SAR(P) process: A seasonal AR process of the series with its past P seasonal lags. For example, a SAR(2) is an AR process of the series with its past two seasonal lags, that is, Φ1Yt-f + Φ2Yt-2f, where Φ represents the seasonal coefficient of the SAR process, and f represents the series frequency.
  • SMA(Q) process...
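
As a minimal sketch, a SARIMA model can be fitted with the Arima function from the forecast package; the (1,1,1)(1,1,1) orders below are illustrative placeholders rather than tuned values:

library(forecast)

# Fit a SARIMA(1,1,1)(1,1,1) model to a monthly series (frequency = 12)
md_sarima <- Arima(AirPassengers, order = c(1, 1, 1), seasonal = c(1, 1, 1))
summary(md_sarima)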

The auto.arima function

One of the main challenges of forecasting with the ARIMA family of models is the cumbersome tuning process. As we saw in this chapter, this process includes many manual steps: verifying the structure of the series (stationary or non-stationary), transforming the data, conducting descriptive analysis with the ACF and PACF plots to identify the type of process, and eventually tuning the model's parameters. While it might take a few minutes to train an ARIMA model for a single series, this approach does not scale when you have dozens of series to forecast.

The auto.arima function from the forecast package provides a solution to this issue. This algorithm automates the tuning process of the ARIMA model, using statistical methods to identify both the structure of the series (stationary or not) and its type (seasonal or not), and sets the model's...
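
A minimal usage sketch (with the built-in AirPassengers series as a stand-in input):

library(forecast)

# Search for the series structure and the model orders automatically
md_auto <- auto.arima(AirPassengers)
md_auto

# Forecast the next 12 observations with the selected model
fc_auto <- forecast(md_auto, h = 12)
plot(fc_auto)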

Linear regression with ARIMA errors

In Chapter 9, Forecasting with Linear Regression, we saw that with some simple steps, we can utilize a linear regression model as a time series forecasting model. Recall that a general form of the linear regression model can be represented by the following equation:

Yt = β0 + β1X1 + β2X2 + ... + βnXn + εt
One of the main assumptions of the linear regression model is that the error term of the series, εt, is white noise (that is, there is no correlation between the residuals and their lags). However, when working with time series data, this assumption is often violated, since typically the model's predictors do not explain all of the variation in the series, and some patterns are left over in the model residuals. An example of the failure of this assumption can be seen when fitting a linear regression model to forecast the AirPassengers series.

...
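
The following sketch (an illustration, not the book's elided code) uses the tslm and checkresiduals functions from the forecast package to expose this issue:

library(forecast)

# Fit a linear regression model with trend and seasonal components
md_lm <- tslm(AirPassengers ~ trend + season)

# Plot the residuals and their ACF, and run the Ljung-Box test; for this
# series, significant autocorrelation is typically left in the residuals
checkresiduals(md_lm)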

Summary

In this chapter, we introduced the ARIMA family of models, one of the core approaches for forecasting time series data. The main advantages of the ARIMA family of models are its flexibility and modularity, as it can handle both seasonal and non-seasonal time series data by adding or modifying the model components. In addition, we saw the applications of the ACF and PACF plots for identifying the type of process (for example, AR, MA, or ARMA) and its order.

While it is essential to be familiar with the tuning process of ARIMA models, in practice, as the number of series to be forecast increases, you may want to automate this process. The auto.arima function is one of the most common approaches in R to forecasting with ARIMA models, as it can scale up when dozens of series need to be forecast.

Last but not least, we saw applications of linear regression with the ARIMA...
