Implementing a regression forest
In Chapter 3, Predicting Online Ad Click-Through with Tree-Based Algorithms, we explored random forests as an ensemble learning method that combines multiple decision trees, each trained separately, with a random subset of the training features considered at each node of a tree. In classification, a random forest makes its final decision by a majority vote over the decisions of all trees. Applied to regression, a random forest regression model (also called a regression forest) takes the average of the regression results from all decision trees as its final prediction.
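The averaging step can be checked directly with a minimal sketch. It assumes synthetic data from scikit-learn's make_regression (not the housing data) and arbitrary settings (20 trees, random_state=42) chosen only for illustration. It compares the mean of the individual tree predictions with the forest's own prediction:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic regression data (an assumption; it stands in for real data)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0,
                       random_state=42)

# A small forest; the hyperparameters here are illustrative only
forest = RandomForestRegressor(n_estimators=20, random_state=42)
forest.fit(X, y)

# Predictions of each fitted tree for the first five samples,
# collected into an array of shape (n_trees, n_samples)
tree_predictions = np.array(
    [tree.predict(X[:5]) for tree in forest.estimators_]
)

# Averaging over the trees should reproduce the forest's prediction
manual_average = tree_predictions.mean(axis=0)
forest_prediction = forest.predict(X[:5])
print(np.allclose(manual_average, forest_prediction))
```

The fitted trees are available through the estimators_ attribute, so the same comparison can be repeated on any fitted regression forest.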
Here, we will use the regression forest class, RandomForestRegressor, from scikit-learn and apply it to our California house price prediction example:
>>> from sklearn.ensemble import RandomForestRegressor
>>> regressor = RandomForestRegressor(n_estimators=100,
                                  max_depth=10,
                                  min_samples_split=3,
                    ... 
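As a rough, self-contained sketch of the fit and predict workflow, the following uses only the three hyperparameters visible above. The random_state value, the train/test split, the R^2 metric, and the synthetic data from make_regression (used in place of the California housing data) are all assumptions made for illustration:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the housing dataset (an assumption)
X, y = make_regression(n_samples=500, n_features=8, noise=15.0,
                       random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Only the three hyperparameters shown in the text, plus a hypothetical
# random_state so that the run is reproducible
regressor = RandomForestRegressor(n_estimators=100,
                                  max_depth=10,
                                  min_samples_split=3,
                                  random_state=42)
regressor.fit(X_train, y_train)

# Predict on the held-out samples and report the R^2 score
predictions = regressor.predict(X_test)
print(f'R^2 on the held-out split: {r2_score(y_test, predictions):.3f}')
```

The score printed here reflects the synthetic data only and says nothing about performance on the housing example.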