Extracting features from dates with pandas
The values of datetime variables can be dates, times, or both. We’ll begin by focusing on those variables that contain dates. We rarely use raw dates with machine learning algorithms. Instead, we extract simpler features, such as the year, month, or day of the week, that allow us to capture insights such as seasonality, periodicity, and trends.
The pandas Python library is well suited to working with dates and times. Using the pandas dt accessor, we can access the datetime properties of a pandas Series to extract many features. However, to leverage this functionality, the variables need to be cast into a data type that supports these operations, such as datetime or timedelta.
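As a minimal sketch of what this looks like in practice, the following snippet builds a small Series of dates and reads a few properties through dt. The sample dates and the names dates and features are made up for illustration and are not part of the recipe's dataset.

```python
import pandas as pd

# Hypothetical example data: three consecutive days spanning a month boundary.
dates = pd.Series(pd.date_range("2024-01-30", periods=3, freq="D"))

# Each dt property returns a Series aligned with the original dates.
features = pd.DataFrame(
    {
        "year": dates.dt.year,
        "month": dates.dt.month,
        "day_of_week": dates.dt.dayofweek,  # Monday is 0, Sunday is 6
    }
)

print(features)
```

Because dates already has a datetime data type, no casting is needed before calling dt here.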
Note
Datetime variables are often stored as the object data type, particularly when we load the data from a CSV file. To extract the date and time features that we will discuss throughout this chapter, it is necessary to recast the variables as datetime.
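One way to do the recasting is with pandas.to_datetime. The sketch below assumes a tiny in-memory CSV with a hypothetical date column; the column names and values are invented for illustration only.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "date,sales\n2024-03-01,10\n2024-03-02,12\n"

df = pd.read_csv(io.StringIO(csv_text))

# Without further instructions, the dates are read as text
# (typically the object dtype, or a string dtype in newer pandas versions).
dtype_before = df["date"].dtype

# Recast the column so that the dt accessor becomes available.
df["date"] = pd.to_datetime(df["date"])
dtype_after = df["date"].dtype

print(dtype_before, "->", dtype_after)
print(df["date"].dt.month.tolist())
```

Alternatively, read_csv accepts a parse_dates argument that performs the conversion while loading, for example pd.read_csv(path, parse_dates=["date"]).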
In this recipe...