You're reading from Forecasting Time Series Data with Facebook Prophet

Product type: Book
Published in: Mar 2021
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800568532
Edition: 1st Edition
Author: Greg Rafferty

Greg Rafferty is a data scientist in San Francisco, California. With over a decade of experience, he has worked with many of the top firms in tech, including Google, Facebook, and IBM. Greg has been an instructor in business analytics on Coursera and has led face-to-face workshops with industry professionals in data science and analytics. With both an MBA and a degree in engineering, he is able to work across the spectrum of data science and communicate with both technical experts and non-technical consumers of data alike.

Chapter 3: Non-Daily Data

When Prophet was first released, it assumed all data would be on a daily scale, with one row of data per day. It has since grown to handle many different granularities of data, but because of its historical conventions, there are a few things to be cautious of when working with non-daily data.

In this chapter, you will look at monthly data (and in fact, any data that is measured in timeframes greater than a day) and see how to change the frequency of predictions to avoid unexpected results. You will also look at hourly data and observe an additional component in the components plot. Finally, you will learn how to handle data that has regular gaps along the time axis.

This chapter will cover the following:

  • Using monthly data
  • Using sub-daily data
  • Using data with regular gaps

Technical requirements

The data files and code for examples in this chapter can be found at https://github.com/PacktPublishing/Forecasting-Time-Series-Data-with-Facebook-Prophet.

Please refer to the Preface of this book for the technical requirements necessary to run the code examples.

Using monthly data

In Chapter 2, Getting Started with Facebook Prophet, we built our first Prophet model using the Mauna Loa dataset. The data was reported every day, which is what Prophet expects by default, so we did not need to change any of Prophet's default parameters. In this next example, though, let's take a look at a dataset that is not reported every day, the Air Passengers dataset, to see how Prophet handles this difference in data granularity.

This is a classic time series dataset spanning 1949 through 1960. It counts the number of passengers on commercial airlines each month during that period of explosive growth in the industry. The Air Passengers dataset, in contrast to the Mauna Loa dataset, has one observation per month. What happens if we attempt to predict future dates?
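Before fitting anything, the date arithmetic alone hints at the answer. The sketch below uses only pandas and a hypothetical stand-in for the dataset's ds column (month-start timestamps from 1949 through 1960). The future_dates helper is an illustrative assumption that roughly mimics how a future DataFrame extends past the last training date; it is not Prophet's own code.

```python
import pandas as pd

# Hypothetical stand-in for the Air Passengers 'ds' column:
# one timestamp per month, on the first of the month, 1949 through 1960.
history = pd.date_range(start="1949-01-01", end="1960-12-01", freq="MS")


def future_dates(last_date, periods, freq):
    """Illustrative helper: return `periods` dates after `last_date` at `freq`."""
    # Generate periods + 1 stamps starting at last_date, then drop last_date itself.
    dates = pd.date_range(start=last_date, periods=periods + 1, freq=freq)
    return dates[dates > last_date][:periods]


last_date = history.max()

# Daily frequency: 60 periods reach only about two months ahead, at a
# granularity that never appears in the monthly training data.
daily = future_dates(last_date, periods=60, freq="D")

# Month-start frequency: 60 periods reach five years ahead and line up
# with the spacing of the training data.
monthly = future_dates(last_date, periods=60, freq="MS")

print(daily[-1].date())    # 1961-01-30
print(monthly[-1].date())  # 1965-12-01
```

With a fitted Prophet model, the corresponding step would be a call such as model.make_future_dataframe(periods=60, freq='MS'), where the periods value here is only an example.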

Let's create a model and plot the forecast to see what happens. We begin as we did with the Mauna Loa example, by importing the necessary libraries...

Using sub-daily data

In this section, we will be using data from the Divvy bike share program in Chicago, Illinois. The data contains the number of bicycle rides taken each hour from the beginning of 2014 through the end of 2018 and exhibits a general increasing trend along with very strong yearly seasonality. Because it is hourly data and there are very few rides overnight (sometimes zero per hour), the data does show a density of measurements at the low end:

Figure 3.4 – Divvy number of rides per hour

Using sub-daily data such as this is much the same as using super-daily data, as we did with the Air Passengers data previously. You as the analyst need to use the freq argument and adjust the periods in the make_future_dataframe method, and Prophet will do the rest. If Prophet sees at least two days of data and the spacing between observations is less than one day, it will fit a daily seasonality.
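The rule for daily seasonality described above can be paraphrased in a few lines of pandas. This is a minimal sketch of the condition as stated in the text, not Prophet's implementation. The function name would_fit_daily_seasonality and the sample date ranges are assumptions made for illustration. The frequency alias "h" is used for hourly data; older pandas versions spell it "H".

```python
import pandas as pd


def would_fit_daily_seasonality(ds):
    """Paraphrase of the rule: at least two days of data, and a spacing
    between observations of less than one day."""
    ds = pd.Series(pd.to_datetime(ds)).sort_values()
    span = ds.max() - ds.min()
    gaps = ds.diff().dropna()
    # Ignore duplicate timestamps when looking for the smallest spacing.
    min_gap = gaps[gaps > pd.Timedelta(0)].min()
    return bool(span >= pd.Timedelta(days=2) and min_gap < pd.Timedelta(days=1))


# Hypothetical samples: one week of hourly stamps, two years of monthly stamps.
hourly = pd.date_range("2014-01-01", periods=24 * 7, freq="h")
monthly = pd.date_range("1949-01-01", periods=24, freq="MS")

print(would_fit_daily_seasonality(hourly))   # True
print(would_fit_daily_seasonality(monthly))  # False

# Periods are counted in units of the chosen frequency, so a one-year
# horizon at hourly resolution needs 365 * 24 periods.
periods = 365 * 24
future = pd.date_range(start=hourly.max(), periods=periods + 1, freq="h")[1:]
print(len(future))  # 8760
```

The last three lines show why the periods argument has to be adjusted along with freq: the same one-year horizon that takes 365 daily periods takes 8,760 hourly ones.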

Let's see this by making a simple forecast. We already...

Using data with regular gaps

Throughout your career, you may encounter datasets with regular gaps in reporting, particularly when the data was collected by humans who have working hours, personal hours, and sleeping hours. It simply may not be possible to collect measurements with perfect periodicity.

As you will see when we look at outliers in a later chapter, Prophet is robust in handling missing values. However, when that missing data occurs at regular intervals, Prophet will have no training data at all during those gaps to make estimations with. The seasonality will be constrained during periods where data exists but unconstrained during the gaps, so Prophet's predictions can fluctuate far more than the actual data ever did. Let's see this in action.

Suppose that Divvy's data had only been collected between the hours of 8am and 6pm each day. We can simulate this by removing data outside these hours from our DataFrame:

df = df[(df['ds'...
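One way such a filter could be written is sketched below with pandas. The small DataFrame is a hypothetical stand-in for the Divvy data, the helper name keep_collection_hours is invented for illustration, and the boundary handling (8:00 included, 18:00 excluded) is an assumption rather than a detail confirmed by the text. Applying the same filter to the future dates keeps the forecast schedule aligned with the hours that were actually observed.

```python
import pandas as pd

# Hypothetical hourly frame standing in for the Divvy data:
# 'ds' holds timestamps and 'y' holds a placeholder ride count.
df = pd.DataFrame({"ds": pd.date_range("2018-01-01", periods=48, freq="h")})
df["y"] = range(len(df))


def keep_collection_hours(frame, start_hour=8, end_hour=18):
    """Keep rows whose hour falls in [start_hour, end_hour)."""
    hours = frame["ds"].dt.hour
    return frame[(hours >= start_hour) & (hours < end_hour)]


# Simulate data that was only collected between 8am and 6pm.
train = keep_collection_hours(df)

# Build hourly future dates, then drop the hours with no training data
# so that predictions are only requested where the model is constrained.
future = pd.DataFrame({"ds": pd.date_range("2018-01-03", periods=48, freq="h")})
future = keep_collection_hours(future)

print(len(train), len(future))  # 20 20
```

With Prophet, the future frame would typically come from make_future_dataframe with an hourly freq before being filtered in the same way.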

Summary

In this chapter, you took the lessons learned from the basic Mauna Loa model you built in Chapter 2, Getting Started with Facebook Prophet, and learned what changes you need to make when the period of your data is not daily. Specifically, you used the Air Passengers dataset to model monthly data and used the freq argument when making your future DataFrame to keep Prophet from forecasting at a daily frequency.

Then, you used the hourly data from Divvy's bike share program to set the future frequency to hourly so that Prophet would increase the granularity of its prediction timescale. Finally, you simulated periodic missing data in the Divvy dataset and learned a different way to match the future DataFrame's schedule to that of the training data, in order to prevent Prophet from making unconstrained predictions.

Now that you know how to handle the different datasets you will encounter in this book, you're ready for the next topic! In the next chapter, you...
