We will examine the California housing dataset with gradient boosting trees. Our overall approach will be the same as before:
- Focus on important parameters in the gradient boosting algorithm:
  - max_features
  - max_depth
  - min_samples_leaf
  - learning_rate
  - loss
- Create a parameter distribution where the most important parameters are varied.
- Perform a randomized search over that distribution. If using an ensemble, keep the number of estimators low at first so that each candidate trains quickly.
- Use the best parameters from the previous step with many estimators.