You're reading from  Hands-On Time Series Analysis with R

Product type: Book
Published in: May 2019
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781788629157
Edition: 1st Edition
Author: Rami Krispin

Rami Krispin is a data scientist at a major Silicon Valley company, where he focuses on time series analysis and forecasting. In his free time, he also develops open source tools and is the author of several R packages, including the TSstudio package for time series analysis and forecasting applications. Rami holds an MA in Applied Economics and an MS in actuarial mathematics from the University of Michigan, Ann Arbor.

Preface

Time series analysis is the art of extracting meaningful insights and revealing patterns from time series data using statistical and data visualization approaches. These insights and patterns can then be utilized to explore past events and forecast future values in the series.

This book goes through all the steps of the time series analysis process, from getting the raw data, to building a forecasting model using R. You will learn how to use tools from packages such as stats, lubridate, xts, and zoo to clean and reformat your raw data into structural time series data. As you make your way through Hands-On Time Series Analysis with R, you will analyze data and extract meaningful information from it using both descriptive statistics and rich data visualization tools in R, such as the TSstudio, plotly, and ggplot2 packages. The latter part of the book delves into traditional forecasting models such as time series regression models, exponential smoothing, and autoregressive integrated moving average (ARIMA) models using the forecast package. Last but not least, you will learn how to utilize machine learning models such as Random Forest and Gradient Boosting Machine to forecast time series data with the h2o package.

Who this book is for

This book is ideal for the following groups of people:

  • Data scientists who wish to learn how to perform time series analysis and forecasting with R.
  • Data analysts who perform Excel-based time series analysis and forecasting and wish to take their forecasting skills to the next level.

Basic knowledge of statistics (for example, regression analysis and hypothesis testing) is required. Some knowledge of R is expected but not mandatory; for those who have never used R, Chapter 1, Introduction to Time Series Analysis and R, provides a brief introduction.

What this book covers

Chapter 1, Introduction to Time Series Analysis and R, provides a brief introduction to the time series analysis process and defines the attributes and characteristics of time series data. In addition, the chapter provides a brief introduction to R for readers with no prior knowledge of R. This includes the mathematical and logical operators, loading data from multiple sources (such as flat files and APIs), installing packages, and so on.

Chapter 2, Working with Date and Time Objects, focuses on the main date and time classes in R—the Date and POSIXct/lt classes—and their attributes. This includes ways to reformat date and time objects with the base and lubridate packages.
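
As a rough illustration of the two classes named above, the following sketch uses only base R. The example dates and the variable names d and dt are made up for illustration and are not taken from the book.

```r
# Parse a character string into a Date object
d <- as.Date("2019-05-31")
class(d)                       # "Date"

# Reformat the Date for display
format(d, "%d/%m/%Y")          # "31/05/2019"

# Date arithmetic works in units of days
d + 30                         # "2019-06-30"

# Parse a date-time string into POSIXct with an explicit time zone
dt <- as.POSIXct("2019-05-31 14:30:00", tz = "UTC")
class(dt)                      # "POSIXct" "POSIXt"

# POSIXct arithmetic works in units of seconds
dt + 60 * 60                   # one hour later
```

Setting tz explicitly avoids results that depend on the time zone of the machine running the code.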

Chapter 3, The Time Series Object, focuses on the ts class, an R core class for time series data. This chapter dives deep into the attributes of the ts class, methods for creating and manipulating ts objects using tools from the stats package, and data visualization applications with the TSstudio and dygraphs packages.
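
A minimal sketch of creating and inspecting a ts object with the stats package might look like the following. The series is simulated, and the names x and x_2001 are hypothetical.

```r
# Build a monthly series of 36 simulated values starting in January 2000
set.seed(1)
x <- ts(rnorm(36), start = c(2000, 1), frequency = 12)

# Core attributes of the ts class
frequency(x)   # 12 observations per cycle (year)
start(x)       # 2000 1
end(x)         # 2002 12

# Subset the series to calendar year 2001
x_2001 <- window(x, start = c(2001, 1), end = c(2001, 12))
length(x_2001) # 12
```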

Chapter 4, Working with zoo and xts Objects, covers the applications of the zoo and xts classes, advanced formats for time series data. This chapter focuses on the attributes of the zoo and xts classes and the preprocessing and data visualization tools from the zoo and xts packages.

Chapter 5, Decomposition of Time Series Data, focuses on decomposing time series data into its structural patterns: the trend, seasonal, cycle, and random components. Starting with the moving average function, the chapter explains how to use the function for smoothing, and then focuses on decomposing a time series down to its components with the moving average.
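
To make the idea concrete, here is a hedged sketch that smooths a series with a centered moving average and then decomposes it with stats::decompose. It assumes the built-in AirPassengers dataset, which is not necessarily the data used in the book, and the names w, trend_ma, and dec are illustrative.

```r
x <- AirPassengers  # monthly series shipped with base R (assumed example data)

# Centered 12-month moving average: half weights on the two end points
# so that the window is symmetric around each month
w <- c(0.5, rep(1, 11), 0.5) / 12
trend_ma <- stats::filter(x, filter = w, sides = 2)

# Classical decomposition based on the same moving average
dec <- decompose(x, type = "multiplicative")

# The components are stored as separate series
head(dec$seasonal, 12)   # seasonal factors, one per month
plot(dec)                # trend, seasonal, and random panels
```

The first and last six values of the trend are NA because a centered window cannot be computed at the edges of the series.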

Chapter 6, Seasonality Analysis, explains approaches and methods for exploring and revealing seasonal patterns in time series data. This includes the use of summary statistics, along with data visualization tools from the forecast, TSstudio, and ggplot2 packages.

Chapter 7, Correlation Analysis, focuses on methods and techniques for analyzing the relationship between a time series and its lags or other series. This chapter provides a general background for correlation analysis, and introduces statistical methods and data visualization tools for measuring the correlation between a time series and its lags or between multiple time series.
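
As an illustrative sketch, the correlation between a series and its own lags can be measured with the acf and pacf functions from the stats package. The AirPassengers dataset is an assumption made here for the sake of a runnable example.

```r
x <- AirPassengers  # assumed example data

# Autocorrelation for lags 0 to 24 (months), without drawing the plot
a <- acf(x, lag.max = 24, plot = FALSE)
a$acf[1]            # lag 0 is the series against itself

# Partial autocorrelation: correlation with a lag after removing
# the effect of the shorter lags
p <- pacf(x, lag.max = 24, plot = FALSE)

# Draw the correlogram
plot(a, main = "ACF of AirPassengers")
```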

Chapter 8, Forecasting Strategies, introduces approaches, strategies, and tools for building time series forecasting models. This chapter covers different training strategies, different error metrics, benchmarking, and evaluation methods for forecasting models.
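
One possible way to sketch a training and testing split, a simple benchmark, and two common error metrics in base R is shown below. The dataset, the 12-month holdout, and the seasonal naive benchmark are all assumptions chosen for illustration.

```r
x <- AirPassengers  # assumed example data, January 1949 to December 1960

# Hold out the last 12 months for testing
train <- window(x, end = c(1959, 12))
test  <- window(x, start = c(1960, 1))

# Seasonal naive benchmark: repeat the last observed year of the training set
snaive_fc <- as.numeric(window(train, start = c(1959, 1)))
actual    <- as.numeric(test)

# Error metrics on the testing set
mape <- mean(abs(actual - snaive_fc) / actual) * 100
rmse <- sqrt(mean((actual - snaive_fc)^2))
c(MAPE = mape, RMSE = rmse)
```

Any candidate model can then be scored on the same testing set and compared against the benchmark values.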

Chapter 9, Forecasting with Linear Regression, dives into the forecasting applications of the linear regression model. This chapter explains how to model the different components of a series with linear regression by creating new features from the series. In addition, this chapter covers the advanced modeling of structural breaks, outliers, holidays, and time series with multiple seasonality.
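
A minimal sketch of creating trend and seasonal features from a series and fitting them with lm could look like this. The dataset, the log transformation, and the names df, fit, new_df, and fc are assumptions for illustration rather than the book's own example.

```r
x <- AirPassengers  # assumed example data

# Create features from the series: a linear trend index and a monthly factor
df <- data.frame(
  y      = as.numeric(log(x)),            # log scale (an assumption)
  trend  = seq_along(x),
  season = factor(as.integer(cycle(x)))   # month of the year, 1 to 12
)

fit <- lm(y ~ trend + season, data = df)

# Features for the next 12 months
new_df <- data.frame(
  trend  = length(x) + 1:12,
  season = factor(1:12, levels = levels(df$season))
)

# Forecast, transformed back to the original scale
fc <- exp(predict(fit, newdata = new_df))
```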

Chapter 10, Forecasting with Exponential Smoothing Models, focuses on forecasting time series data with exponential smoothing functions. This chapter explains the usage of smoothing parameters to forecast time series data. The models range from simple exponential smoothing, which is for time series with neither trend nor seasonal components, to advanced smoothing models such as Holt-Winters forecasting, which is for time series with both trend and seasonal components.
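
The two ends of that range can be sketched with the HoltWinters function from the stats package. This is a stand-in chosen to keep the example dependency-free; the book itself works with the forecast package, and the dataset is an assumption.

```r
x <- AirPassengers  # assumed example data

# Simple exponential smoothing: switch off the trend (beta) and
# seasonal (gamma) components, leaving only the level parameter alpha
ses <- HoltWinters(x, beta = FALSE, gamma = FALSE)
ses$alpha

# Holt-Winters with trend and multiplicative seasonality
hw <- HoltWinters(x, seasonal = "multiplicative")

# Forecast the next 12 months
fc <- predict(hw, n.ahead = 12)
```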

Chapter 11, Forecasting with ARIMA Models, covers the ARIMA family of forecasting models. This chapter introduces the different types of ARIMA models: the autoregressive (AR), moving average (MA), ARMA, ARIMA, and seasonal ARIMA (SARIMA) models. In addition, the chapter focuses on methods and approaches to identify, tune, and optimize ARIMA models with both autocorrelation and partial autocorrelation functions using applications from the stats and forecast packages.
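
For a sense of what a seasonal ARIMA fit looks like, the following sketch uses stats::arima on the log of the built-in AirPassengers series. The dataset and the model orders are assumptions picked for illustration, not a tuned model from the book.

```r
x <- log(AirPassengers)  # assumed example data, log-transformed

# SARIMA(0,1,1)(0,1,1)[12]: one non-seasonal and one seasonal difference,
# each with a single moving average term
fit <- arima(
  x,
  order    = c(0, 1, 1),
  seasonal = list(order = c(0, 1, 1), period = 12)
)
coef(fit)  # ma1 and sma1

# Forecast the next 12 months and return to the original scale
fc <- predict(fit, n.ahead = 12)
exp(fc$pred)
```

In practice the orders would be chosen by inspecting the ACF and PACF of the differenced series and by comparing candidate models.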

Chapter 12, Forecasting with Machine Learning Models, focuses on methods and approaches for forecasting time series data with machine learning models using the h2o package. This chapter explains the different steps of modeling time series data with machine learning models. This includes feature engineering, training and tuning approaches, evaluation, and benchmarking a forecasting model's performance.

To get the most out of this book

This book was written under the assumption that its readers have the following knowledge and skills:

  • Basic knowledge of statistics or econometrics, which includes topics such as regression modeling, hypothesis testing, normal distribution, and so on
  • Experience with R, or another programming language

You will need to have R installed (https://cran.r-project.org/) and it is recommended to install the RStudio IDE (https://www.rstudio.com/products/rstudio/).

Download the example code files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

  1. Log in or register at www.packt.com.
  2. Select the SUPPORT tab.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows
  • Zipeg/iZip/UnRarX for Mac
  • 7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Time-Series-Analysis-with-R. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "We will use the Sys.Date and Sys.time functions to pull date and time objects respectively."

A block of code is set as follows:

library(TSstudio)

data(USgas)

The output of the R code is prefixed by the ## sign:

ts_info(USgas)
##  The USgas series is a ts object with 1 variable and 227 observations
## Frequency: 12
## Start time: 2000 1
## End time: 2018 11

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Select System info from the Administration panel."

Warnings or important notes appear like this.
Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at customercare@packtpub.com.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.
