Using multiple features
To recap the tools seen in the previous chapter, we reload all the packages and the Boston dataset:
In: import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import matplotlib as mpl
    from sklearn.datasets import load_boston
    from sklearn import linear_model
If you are working on the code in an IPython Notebook (as we strongly suggest), the following magic command will allow you to visualize plots directly on the interface:
In: %matplotlib inline
We are still using the Boston dataset, which relates house prices in 1970s Boston to a series of statistics aggregated at the census-zone level:
In: boston = load_boston()
    dataset = pd.DataFrame(boston.data, columns=boston.feature_names)
    dataset['target'] = boston.target
We will always keep a few informative variables at hand: the number of observations, the variable names, the input data matrix, and the response vector:
In: observations...
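As a rough sketch of the bookkeeping described above, the following example uses a small synthetic DataFrame laid out like `dataset` (feature columns first, a `target` column last). The names `demo`, `n_obs`, `feature_names`, `X_demo`, and `y_demo`, as well as the column labels and values, are hypothetical and chosen only for illustration; this is one plausible way to keep these quantities at hand, not necessarily the chapter's own listing.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Boston DataFrame: three made-up feature
# columns plus a 'target' column in the last position. All values are
# illustrative random numbers, not real housing data.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(size=(5, 3)), columns=['f1', 'f2', 'f3'])
demo['target'] = rng.normal(size=5)

n_obs = len(demo)                   # number of observations (rows)
feature_names = demo.columns[:-1]   # every column except the response
X_demo = demo.iloc[:, :-1]          # input data matrix (features only)
y_demo = demo['target'].values      # response vector as a NumPy array
```

Slicing by position with `iloc[:, :-1]` relies on the response being the last column, which holds here because `target` was appended after the features.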