You're reading from Machine Learning with PyTorch and Scikit-Learn

Product type Book

Published in Feb 2022

Publisher Packt

ISBN-13 9781801819312

Pages 774 pages

Edition 1st Edition

Languages

Concepts

Machine Learning

Authors (3):

Sebastian Raschka

Yuxi (Hayden) Liu

Vahid Mirjalili

View More author details

Table of Contents (22) Chapters

Preface

1. Giving Computers the Ability to Learn from Data

2. Training Simple Machine Learning Algorithms for Classification

3. A Tour of Machine Learning Classifiers Using Scikit-Learn

4. Building Good Training Datasets – Data Preprocessing

5. Compressing Data via Dimensionality Reduction

6. Learning Best Practices for Model Evaluation and Hyperparameter Tuning

7. Combining Different Models for Ensemble Learning

8. Applying Machine Learning to Sentiment Analysis

9. Predicting Continuous Target Variables with Regression Analysis

10. Working with Unlabeled Data – Clustering Analysis

11. Implementing a Multilayer Artificial Neural Network from Scratch

12. Parallelizing Neural Network Training with PyTorch

13. Going Deeper – The Mechanics of PyTorch

14. Classifying Images with Deep Convolutional Neural Networks

15. Modeling Sequential Data Using Recurrent Neural Networks

16. Transformers – Improving Natural Language Processing with Attention Mechanisms

17. Generative Adversarial Networks for Synthesizing New Data

18. Graph Neural Networks for Capturing Dependencies in Graph Structured Data

19. Reinforcement Learning for Decision Making in Complex Environments

20. Other Books You May Enjoy

21. Index

Assessing feature importance with random forests

In previous sections, you learned how to use L1 regularization to zero out irrelevant features via logistic regression and how to use the SBS algorithm for feature selection and apply it to a KNN algorithm. Another useful approach for selecting relevant features from a dataset is using a random forest, an ensemble technique that was introduced in Chapter 3. Using a random forest, we can measure the feature importance as the averaged impurity decrease computed from all decision trees in the forest, without making any assumptions about whether our data is linearly separable or not. Conveniently, the random forest implementation in scikit-learn already collects the feature importance values for us so that we can access them via the feature_importances_ attribute after fitting a RandomForestClassifier. By executing the following code, we will now train a forest of 500 trees on the Wine dataset and rank the 13 features by their respective...

The rest of the chapter is locked

You're reading from Machine Learning with PyTorch and Scikit-Learn

Table of Contents (22) Chapters

Assessing feature importance with random forests

Unlock this book and the full library FREE for 7 days

Authors (3)

Personalised recommendations for you