R for Data Science Cookbook

Product type: Book
Published in: Jul 2016
Reading level: Intermediate
ISBN-13: 9781784390815
Edition: 1st Edition
Author: Yu-Wei, Chiu (David Chiu)

Yu-Wei, Chiu (David Chiu) is the founder of LargitData (www.LargitData.com), a startup company that mainly focuses on providing big data and machine learning products. He has previously worked for Trend Micro as a software engineer, where he was responsible for building big data platforms for business intelligence and customer relationship management systems. In addition to being a start-up entrepreneur and data scientist, he specializes in using Spark and Hadoop to process big data and apply data mining techniques for data analysis. Yu-Wei is also a professional lecturer and has delivered lectures on big data and machine learning in R and Python, and given tech talks at a variety of conferences. In 2015, Yu-Wei wrote Machine Learning with R Cookbook, Packt Publishing. In 2013, Yu-Wei reviewed Bioinformatics with R Cookbook, Packt Publishing. For more information, please visit his personal website at www.ywchiu.com.

Acknowledgement

I have immense gratitude for my family and friends for supporting and encouraging me to complete this book. I would like to sincerely thank my mother, Ming-Yang Huang (Miranda Huang); my mentor, Man-Kwan Shan; the proofreader of this book, Brendan Fisher; Members of LargitData; Data Science Program (DSP); and other friends who have offered their support.

Chapter 10. Time Series Mining with R

This chapter covers the following topics:

  • Creating time series data

  • Plotting a time series object

  • Decomposing time series

  • Smoothing time series

  • Forecasting time series

  • Selecting an ARIMA model

  • Creating an ARIMA model

  • Forecasting with an ARIMA model

  • Predicting stock prices with an ARIMA model

Introduction


One of the earliest known examples of time series analysis comes from ancient Egypt. The ancient Egyptians recorded the inundation (rise) and recession (fall) of the Nile river every day, noting when fertile silt and moisture arrived. Based on these records, they found that the inundation period began around the time when the star Sirius first became visible just before sunrise. By being able to predict the inundation period, the ancient Egyptians were able to make sophisticated agricultural decisions, greatly improving the yield of their farming activities.

As demonstrated by the ancient Egyptian inundation period example, time series analysis is a method that can extract patterns or meaningful statistics from data with temporal information. It allows us to forecast future values based on observed results. One can apply time series analysis to any data that has temporal information. For example, an economist can perform time series analysis to predict the GDP growth rate...

Creating time series data


To begin time series analysis, we need to create a time series object from a numeric vector or matrix. In this recipe, we show how to use the ts function to create a time series object from the financial report of Taiwan Semiconductor (2330.TW).

Getting ready

Download the tw2330_finance.csv dataset from the following GitHub link:

https://github.com/ywchiu/rcookbook/raw/master/chapter10/tw2330_finance.csv

How to do it…

Please perform the following steps to create time series data:

  1. First, read Taiwan Semiconductor's financial report into an R session:

    > tw2330 = read.csv('tw2330_finance.csv', header=TRUE)
    > str(tw2330)
    'data.frame': 32 obs. of  5 variables:
     $ Time            : Factor w/ 32 levels "2008Q1","2008Q2",..: 1 2 3 4 5 6 7 8 9 10 ...
     $ Total.Income    : int  875 881 930 646 395 742 899 921 922 1050 ...
     $ Gross.Sales     : num  382 402 431 202 74.8 343 429 447 442 519 ...
     $ Operating.Income: num  291 304 329 120 12.1 251 320 336 341 405 ...
     $ EPS ...
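
Because the remaining steps are cut off here, the following is only a minimal sketch of what the ts function expects. It uses the first ten Total.Income values printed by str above and assumes they are consecutive quarters starting in 2008 Q1, as the Time column suggests. The name income_ts is hypothetical and is not the object used in later recipes.

```r
# First ten Total.Income values as printed by str(tw2330) above
income <- c(875, 881, 930, 646, 395, 742, 899, 921, 922, 1050)

# frequency = 4 marks the data as quarterly; start = c(2008, 1) is 2008 Q1
income_ts <- ts(income, start = c(2008, 1), frequency = 4)

print(income_ts)   # printed as a year-by-quarter table
start(income_ts)   # first period: year 2008, quarter 1
end(income_ts)     # last period: year 2010, quarter 2
```

Setting frequency to 12 instead would mark the same vector as monthly data, so the value should match how the observations were actually collected.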

Plotting a time series object


Plotting a time series object makes its trend and seasonal pattern clearly visible. In this recipe, we show how to plot time series data with the plot.ts function.

Getting ready

Ensure you have completed the previous recipe and have the two time series objects it creates, m and m_ts, available in your session.

How to do it…

Please perform the following steps to plot time series data:

  1. First, use the plot.ts function to plot time series data, m:

    > plot.ts(m)
    

    Figure 1: A time series plot of single time series data

  2. If the dataset contains multiple time series, you can plot each series in its own sub-figure:

    > plot.ts(m_ts, plot.type = "multiple")
    

    Figure 2: A time series plot of multiple time series data

  3. Alternatively, you can plot all four time series objects in a single figure:

    > plot.ts(m_ts, plot.type = "single", col=c("red","green","blue", "orange"))
    

    Figure 3: A multiple time series plot in different colors

  4. Moreover, you can...

Decomposing time series


A seasonal time series is made up of seasonal components, deterministic trend components, and irregular components. In this recipe, we show how to use the decompose function to break a time series down into these three parts.

Getting ready

Ensure you have completed the previous recipes and have the two time series objects, m and m_ts, available in your session.

How to do it…

Please perform the following steps to decompose a time series:

  1. First, use the window function to construct a time series object, m.sub, from m:

    > m.sub = window(m, start=c(2012, 1), end=c(2014, 4)) 
    > m.sub
         Qtr1 Qtr2 Qtr3 Qtr4
    2012 1055 1281 1414 1313
    2013 1328 1559 1626 1458
    2014 1482 1830 2090 2225
    > plot(m.sub)
    

    Figure 6: A quarterly time series plot

  2. Use the decompose function to break down the time series object m.sub:

    > components <- decompose(m.sub)
    
  3. We can then use the names function to list the attributes of components:

    > names(components)
    [1] "x"        "seasonal...
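
To make the three components concrete, here is a small sketch that rebuilds the quarterly values printed for m.sub above and checks the additive identity, observed = trend + seasonal + irregular. The names q_income and parts are hypothetical, chosen so as not to overwrite the recipe's own objects.

```r
# Quarterly values of m.sub as printed above (2012 Q1 to 2014 Q4)
q_income <- ts(c(1055, 1281, 1414, 1313,
                 1328, 1559, 1626, 1458,
                 1482, 1830, 2090, 2225),
               start = c(2012, 1), frequency = 4)

parts <- decompose(q_income)   # additive model by default

# In the additive model: observed = trend + seasonal + irregular.
# The trend is a centered moving average, so it is NA for the first
# and last two quarters; compare only where it is defined.
rebuilt <- parts$trend + parts$seasonal + parts$random
defined <- !is.na(rebuilt)
all.equal(as.numeric(rebuilt[defined]), as.numeric(q_income[defined]))

# One seasonal effect per quarter; decompose centers them to sum to zero
parts$figure
```

With only three years of data, each seasonal effect is averaged over very few observations, so the estimates should be read as rough indications.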

Smoothing time series


Time series decomposition allows us to extract distinct components from time series data, while smoothing lets us forecast its future values. In this recipe, we show how to use the HoltWinters function to smooth time series data.

Getting ready

Ensure you have completed the previous recipes and have the two time series objects, m and m_ts, available in your session.

How to do it…

Please perform the following steps to smooth time series data:

  1. First, use HoltWinters to perform Holt-Winters exponential smoothing:

    > m.pre <- HoltWinters(m)
    > m.pre
    Holt-Winters exponential smoothing with trend and additive seasonal component.
    
    Call:
    HoltWinters(x = m)
    
    Smoothing parameters:
     alpha: 0.8223689
     beta : 0.06468208
     gamma: 1
    
    Coefficients:
             [,1]
    a  1964.30088
    b    32.33727
    s1  -51.47814
    s2   17.84420
    s3  146.26704
    s4   70.69912
    
  2. Plot the smoothing result:

    > plot(m.pre)
    

    Figure 9: A time series plot with Winters exponential smoothed...
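
The alpha, beta, and gamma values reported in step 1 control how quickly the level, trend, and seasonal estimates react to new observations. As a hedged illustration, the sketch below fits the full model and a level-only variant to co2, a monthly series that ships with R. This is not the recipe's income data, and the names hw_full and hw_ses are hypothetical.

```r
# co2 is a monthly series bundled with R (atmospheric CO2 at Mauna Loa)
hw_full <- HoltWinters(co2)                               # level + trend + season
hw_ses  <- HoltWinters(co2, beta = FALSE, gamma = FALSE)  # level only

# alpha, beta, and gamma weight the level, trend, and seasonal updates;
# values near 1 favor recent observations, values near 0 favor history
c(alpha = unname(hw_full$alpha),
  beta  = unname(hw_full$beta),
  gamma = unname(hw_full$gamma))

# Sum of squared one-step-ahead errors for each fit
c(full = hw_full$SSE, level_only = hw_ses$SSE)
```

Passing beta = FALSE or gamma = FALSE switches off the trend or seasonal part, which is a quick way to see how much each one contributes to the fit.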

Forecasting time series


After using HoltWinters to build a time series smoothing model, we can forecast future values based on that model. In this recipe, we show how to use the forecast function to make predictions on time series data.

Getting ready

Ensure you have completed the previous recipe by generating a smoothing model with HoltWinters and storing it in a variable, m.pre.

How to do it…

Please perform the following steps to forecast Taiwan Semiconductor's future income:

  1. Load the forecast package:

    > library(forecast)
    
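  2. We can use the forecast function to predict the income of the next four quarters. The generic dispatches to the HoltWinters method, which recent versions of the forecast package no longer export under the name forecast.HoltWinters:

    > income.pre <- forecast(m.pre, h=4)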
    > summary(income.pre)
    
    Forecast method: HoltWinters
    
    Model Information:
    Holt-Winters exponential smoothing with trend and additive seasonal component.
    
    Call:
    HoltWinters(x = m)
    
    Smoothing parameters:
     alpha: 0.8223689
     beta : 0.06468208
     gamma: 1
    
    Coefficients:
             [,1]
    a  1964.30088
    b   ...
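
If the forecast package is not available, base R offers a comparable route through the predict method for HoltWinters objects. The sketch below is an alternative under that assumption, not the recipe's own method. It uses the built-in monthly co2 series instead of the income data, and the names hw_fit and hw_pred are hypothetical.

```r
# Fit a Holt-Winters model to a series bundled with R
hw_fit <- HoltWinters(co2)

# Forecast 4 periods ahead with 95% prediction intervals (base R only)
hw_pred <- predict(hw_fit, n.ahead = 4,
                   prediction.interval = TRUE, level = 0.95)

# A multiple time series with columns fit (point forecast), upr, and lwr
hw_pred
```

The n.ahead argument plays the same role here as h does in the forecast call of step 2.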

Selecting an ARIMA model


Exponential smoothing assumes that the residuals are uncorrelated. However, in real-life cases, it is quite unlikely that consecutive values are uncorrelated with each other. Instead, one can use ARIMA in R to build a time series model that takes autocorrelation into consideration. In this recipe, we show how to select the parameters of an ARIMA model.

Getting ready

In this recipe, we use time series data simulated from an ARIMA process.

How to do it…

Please perform the following steps to select the ARIMA model's parameters:

  1. First, simulate an ARIMA process and generate time series data with the arima.sim function:

    > set.seed(123)
    > ts.sim <- arima.sim(list(order = c(1,1,0), ar = 0.7), n = 100)
    > plot(ts.sim)
    

    Figure 14: Simulated time series data

  2. We can then take the difference of the time series:

    > ts.sim.diff <- diff(ts.sim)
    
  3. Plot the differenced time series:

    > plot(ts.sim.diff)
    

    Figure 15: A differenced time series plot

  4. Use the...
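
The remaining steps are cut off here. A common way to choose p and q, assumed here and not confirmed by the visible text, is to inspect the autocorrelation (ACF) and partial autocorrelation (PACF) of the differenced series. The sketch below repeats the simulation from step 1 under hypothetical names (sim, dsim) and prints the values instead of plotting them.

```r
set.seed(123)
sim  <- arima.sim(list(order = c(1, 1, 0), ar = 0.7), n = 100)
dsim <- diff(sim)   # first difference removes the integrated (d = 1) part

acf_vals  <- acf(dsim,  lag.max = 10, plot = FALSE)
pacf_vals <- pacf(dsim, lag.max = 10, plot = FALSE)

# Approximate 95% significance bound for a white-noise series
bound <- 1.96 / sqrt(length(dsim))

round(acf_vals$acf[2:6], 2)    # ACF at lags 1 to 5 (element 1 is lag 0)
round(pacf_vals$acf[1:5], 2)   # PACF at lags 1 to 5 (starts at lag 1)

# Lags whose partial autocorrelation exceeds the bound
which(abs(pacf_vals$acf) > bound)
```

For an AR(1) process, the ACF is expected to decay gradually while the PACF drops off after lag 1. Sample estimates are noisy, so occasional values slightly outside the bound at higher lags are not unusual.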

Creating an ARIMA model


After determining the optimum p, d, and q parameters for an ARIMA model, we can now create an ARIMA model with the Arima function.

Getting ready

Ensure you have completed the previous recipe by generating a time series object and storing it in a variable, ts.sim.

How to do it…

Please perform the following steps to build an ARIMA model:

  1. First, we can create an ARIMA model with time series ts.sim, with parameters p=1, d=1, q=0:

    > library(forecast)
    > fit <- Arima(ts.sim, order=c(1,1,0))
    > fit
    Series: ts.sim 
    ARIMA(1,1,0)                    
    
    Coefficients:
             ar1
          0.7128
    s.e.  0.0685
    
    sigma^2 estimated as 0.7603:  log likelihood=-128.04
    AIC=260.09   AICc=260.21   BIC=265.3
    
  2. Next, use the accuracy function to print the training set errors of the model:

    > accuracy(fit)
                          ME     RMSE       MAE       MPE
    Training set 0.004938457 0.863265 0.6849681 -41.98798
                     MAPE      MASE          ACF1
    Training set 102.2542 0.7038325...
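
Beyond the training errors, it is common to check whether the residuals of the fitted model still show autocorrelation. The sketch below is an addition rather than a step from the recipe. It uses arima from base R's stats package in place of Arima so that it runs without the forecast package, and the names sim, fit_base, and lb are hypothetical.

```r
set.seed(123)
sim <- arima.sim(list(order = c(1, 1, 0), ar = 0.7), n = 100)

# stats::arima takes the same order = c(p, d, q) argument as forecast::Arima
fit_base <- arima(sim, order = c(1, 1, 0))
coef(fit_base)   # ar1 estimate; the simulation used a true value of 0.7

# Ljung-Box test on the residuals; fitdf = 1 accounts for the one
# estimated AR coefficient
lb <- Box.test(residuals(fit_base), lag = 10, type = "Ljung-Box", fitdf = 1)
lb$p.value
```

A large p-value would suggest no remaining autocorrelation in the residuals, while a small one would suggest that the chosen order misses some structure.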

Forecasting with an ARIMA model


Based on our fitted ARIMA model, we can predict future values. In this recipe, we show how to forecast future values with the forecast function in the forecast package, which dispatches to its forecast.Arima method for ARIMA models.

Getting ready

Ensure you have completed the previous recipe by generating an ARIMA model and storing the model in a variable, fit.

How to do it…

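Please perform the following steps to forecast future values with the fitted ARIMA model:

  1. First, use the forecast function, which dispatches to the forecast.Arima method for a fitted ARIMA model, to generate the prediction of future values. Recent versions of the forecast package no longer export forecast.Arima, so call the generic:

    > fit.predict <- forecast(fit)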
    
  2. We can then use the summary function to obtain the summary of our prediction:

    > summary(fit.predict)
    
    Forecast method: ARIMA(1,1,0)
    
    Model Information:
    Series: ts.sim 
    ARIMA(1,1,0)                    
    
    Coefficients:
             ar1
          0.7128
    s.e.  0.0685
    
    sigma^2 estimated as 0.7603:  log likelihood=-128.04
    AIC=260.09   AICc=260.21   BIC=265.3
    
    Error measures:
                          ME     RMSE       MAE       MPE
    Training set 0.004938457 0.863265...
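
To show how the point forecasts and their uncertainty fit together, here is a base R sketch that uses predict on a model fitted with stats::arima. It is an alternative to the forecast package under stated assumptions: the horizon of ten steps and the names sim, fit_base, and fc are hypothetical, and the intervals use a normal approximation.

```r
set.seed(123)
sim <- arima.sim(list(order = c(1, 1, 0), ar = 0.7), n = 100)
fit_base <- arima(sim, order = c(1, 1, 0))

# predict returns a list with point forecasts ($pred) and standard errors ($se)
fc <- predict(fit_base, n.ahead = 10)

# Approximate 95% prediction interval: point forecast +/- 1.96 standard errors
lower <- fc$pred - 1.96 * fc$se
upper <- fc$pred + 1.96 * fc$se

cbind(forecast = fc$pred, lower, upper)
```

The standard errors grow with the horizon, so the interval widens the further ahead the model predicts.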

Predicting stock prices with an ARIMA model


As the historical prices of a stock also form a time series, we can build an ARIMA model to forecast the future prices of a given stock. In this recipe, we show how to load historical prices with the quantmod package and make predictions on stock prices with ARIMA.

Getting ready

In this recipe, we use the example of stock price prediction to review all the concepts we have covered in previous topics. You will require knowledge of how to create an ARIMA model and make predictions based on a built model to follow this recipe.

How to do it…

Please perform the following steps to predict Facebook's stock price with the ARIMA model:

  1. First, install and load the quantmod package:

    > install.packages("quantmod")
    > library(quantmod)
    
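  2. Download the historical prices of Facebook Inc from Yahoo with quantmod. The company has traded under the ticker META since its 2022 rename to Meta Platforms, so that symbol replaces the original FB:

    > getSymbols("META", src="yahoo", from="2015-01-01")
    
  3. Next, plot the historical stock prices as a line chart:

    > plot(META)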
    

    Figure 23: A historical...
