Chapter 3: Regression Analysis
Activity 5: Plotting Data with a Moving Average
Solution
Load the dataset into a pandas DataFrame from the CSV file:
import pandas as pd
df = pd.read_csv('austin_weather.csv')
df.head()
The output shows the first five rows of the austin_weather.csv file:

Figure 3.74: The first five rows of the Austin weather data
Since we only need the Date and TempAvgF columns, we'll remove all others from the dataset:
df = df[['Date', 'TempAvgF']]
df.head()
The output will be:

Figure 3.75: Date and TempAvgF columns of the Austin weather data
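As an alternative to dropping columns after loading, the selection can be made while reading the file. The following is a minimal sketch, not the activity's own method: it uses the usecols parameter of pd.read_csv on a small in-memory CSV. The CSV text, its values, and the extra column names are hypothetical stand-ins for austin_weather.csv.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for austin_weather.csv
csv_text = (
    "Date,TempHighF,TempAvgF,TempLowF\n"
    "2013-12-21,74,60,45\n"
    "2013-12-22,56,48,39\n"
)

# usecols restricts parsing to the named columns, so the other
# columns are never loaded into the DataFrame
subset = pd.read_csv(io.StringIO(csv_text), usecols=['Date', 'TempAvgF'])
print(subset.columns.tolist())
```

For a file of this size either approach should work equally well; reading only the needed columns mainly helps with memory use on larger files.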
We are initially interested only in the first year of data. To isolate it, create a Year column in the DataFrame by extracting the year from each string in the Date column as an integer. Note that temperatures are recorded daily:
df['Year'] = [int(dt[:4]) for dt in df.Date]
df.head()
The output will be:

Figure 3.76: Extracting the year
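The year extraction above can also be written with pandas accessors instead of a list comprehension. This is a minimal sketch, assuming the Date strings follow a YYYY-MM-DD layout (as the dt[:4] slice implies); the sample rows and the YearFromDatetime column name are hypothetical and are not taken from austin_weather.csv.

```python
import pandas as pd

# Hypothetical sample rows standing in for the Date and TempAvgF columns
sample = pd.DataFrame({
    'Date': ['2013-12-21', '2013-12-22', '2014-01-01'],
    'TempAvgF': [60, 48, 45],
})

# Option 1: slice the first four characters of each string, then cast to int
sample['Year'] = sample['Date'].str[:4].astype(int)

# Option 2: parse the strings as dates and read the year component
sample['YearFromDatetime'] = pd.to_datetime(sample['Date']).dt.year

print(sample)
```

Both options give the same year values for well-formed dates. Parsing with pd.to_datetime also exposes .dt.month, which may be convenient when the same kind of extraction is needed for other date parts.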
Repeat this process to extract the month...