Chapter 3: Introduction to Machine Learning via Scikit-Learn
Activity 5: Generating Predictions and Evaluating the Performance of a Multiple Linear Regression Model
Solution:
1.    Generate predictions on the test data using the following:
predictions = model.predict(X_test)
2.    Plot the predicted versus actual values on a scatterplot using the following code:
import matplotlib.pyplot as plt
from scipy.stats import pearsonr
plt.scatter(y_test, predictions)
plt.xlabel('Y Test (True Values)')
plt.ylabel('Predicted Values')
plt.title('Predicted vs. Actual Values (r = {0:0.2f})'.format(pearsonr(y_test, predictions)[0]))
plt.show()
Refer to the resultant output here:
Figure 3.33: A scatterplot of predicted versus actual values from a multiple linear regression model
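The two steps above rely on model, X_test, and y_test from the earlier exercises. For readers who want to try the prediction and correlation steps in isolation, the following is a minimal, self-contained sketch. It assumes a synthetic three-feature dataset generated with NumPy, which is not the dataset used in this activity, so the value of r it prints will differ from the one shown in the figure. The variable names mirror those used in the solution:

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: an assumption for illustration, not the activity's dataset
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
noise = rng.normal(scale=0.5, size=200)
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + noise

# Hold out 25% of the rows as test data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a multiple linear regression model and predict on the test data
model = LinearRegression().fit(X_train, y_train)
predictions = model.predict(X_test)

# Pearson correlation between the actual and predicted values
r, p_value = pearsonr(y_test, predictions)
print('r = {0:0.2f}'.format(r))
```

Note that pearsonr returns both the correlation coefficient and a p-value. The solution code indexes the result with [0] to keep only the coefficient, whereas this sketch unpacks both values.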
Note
There is a much stronger linear correlation between the predicted and actual values in the multiple linear regression model (r = 0.93) relative to the simple linear regression model...