Data Science Projects with Python - Second Edition


Product type Book
Published in Jul 2021
Publisher Packt
ISBN-13 9781800564480
Pages 432 pages
Edition 2nd Edition
Author (1): Stephen Klosterman

Table of Contents (9 chapters)

Preface
1. Data Exploration and Cleaning
2. Introduction to Scikit-Learn and Model Evaluation
3. Details of Logistic Regression and Feature Exploration
4. The Bias-Variance Trade-Off
5. Decision Trees and Random Forests
6. Gradient Boosting, XGBoost, and SHAP Values
7. Test Set Analysis, Financial Insights, and Delivery to the Client
Appendix

Missing Data

As a final note on the use of both XGBoost and SHAP, one valuable trait of both packages is their ability to handle missing values. Recall that in Chapter 1, Data Exploration and Cleaning, we found that some samples in the case study data had missing values for the PAY_1 feature. So far, our approach has been to simply remove these samples from the dataset when building models, because the machine learning models implemented by scikit-learn cannot work with the data unless the missing values are specifically addressed in some way. Removing the affected samples is one approach, although it may not be satisfactory because it involves throwing data away. If the missing values affect only a very small fraction of the data, this may be fine; in general, however, it's good to know how to deal with missing values.
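To make the "remove the samples" approach concrete, here is a minimal sketch using only the Python standard library. The toy rows, the `ID` values, and the `is_missing` helper are hypothetical and are not taken from the case study data; the sketch only shows how one might measure the fraction of samples that would be thrown away before deciding whether dropping them is acceptable.

```python
import math

# Hypothetical toy rows; NaN marks a missing PAY_1 value.
rows = [
    {"ID": "a", "PAY_1": 0.0},
    {"ID": "b", "PAY_1": float("nan")},
    {"ID": "c", "PAY_1": 2.0},
    {"ID": "d", "PAY_1": -1.0},
]


def is_missing(value):
    """Return True if the value is None or a floating-point NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


# Fraction of samples that would be discarded by dropping missing rows.
missing_fraction = sum(is_missing(r["PAY_1"]) for r in rows) / len(rows)

# Keep only the samples with an observed PAY_1 value.
complete_rows = [r for r in rows if not is_missing(r["PAY_1"])]

print(f"Fraction missing: {missing_fraction:.2f}")
print(f"Rows kept: {len(complete_rows)} of {len(rows)}")
```

With a pandas DataFrame, the same idea would typically be expressed with `isna()` and `dropna(subset=[...])`.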

There are several approaches for imputing missing values of features, such as filling them in with the mean or mode of the non-missing values of that feature, or a randomly selected...
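The mean and mode strategies mentioned above can be sketched as follows. This is an illustrative example under stated assumptions, not the book's implementation: the `impute` function and the `pay_1` values are hypothetical, and only the standard library is used so the sketch stays self-contained.

```python
import math
import statistics


def _is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def impute(values, strategy="mean"):
    """Fill missing entries with the mean or mode of the observed entries.

    Hypothetical helper for illustration only.
    """
    observed = [v for v in values if not _is_missing(v)]
    if strategy == "mean":
        fill = statistics.mean(observed)
    elif strategy == "mode":
        fill = statistics.mode(observed)
    else:
        raise ValueError(f"Unknown strategy: {strategy!r}")
    return [fill if _is_missing(v) else v for v in values]


# Hypothetical feature values with two missing entries.
pay_1 = [0, 0, float("nan"), 2, -1, None, 0]

mean_filled = impute(pay_1, strategy="mean")
mode_filled = impute(pay_1, strategy="mode")

print(mean_filled)
print(mode_filled)
```

The fill value is computed from the non-missing values only, so the observed entries are left unchanged. In practice, pandas `fillna` or scikit-learn's `SimpleImputer` (with `strategy="mean"` or `strategy="most_frequent"`) would likely be used instead of a hand-written helper.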
