Data Science Projects with Python - Second Edition

Product type Book
Published in Jul 2021
Publisher Packt
ISBN-13 9781800564480
Pages 432 pages
Edition 2nd Edition
Author (1): Stephen Klosterman

Table of Contents (9) Chapters

Preface
1. Data Exploration and Cleaning
2. Introduction to Scikit-Learn and Model Evaluation
3. Details of Logistic Regression and Feature Exploration
4. The Bias-Variance Trade-Off
5. Decision Trees and Random Forests
6. Gradient Boosting, XGBoost, and SHAP Values
7. Test Set Analysis, Financial Insights, and Delivery to the Client
Appendix

4. The Bias-Variance Trade-Off

Activity 4.01: Cross-Validation and Feature Engineering with the Case Study Data

Solution:

  1. Select out the features from the DataFrame of the case study data.

    You can use the list of feature names that we've already created in this chapter, but be sure not to include the response variable, which would be a very good (but entirely inappropriate) feature:

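    # The response variable is the last entry in features_response, so drop it
    features = features_response[:-1]
    # Feature matrix as a NumPy array
    X = df[features].values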
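
  2. Make a training/test split using a random seed of 24:

    from sklearn.model_selection import train_test_split

    # Hold out 20% of the samples as the test set; the fixed seed makes the split repeatable
    X_train, X_test, y_train, y_test = train_test_split(
        X, df['default payment next month'].values,
        test_size=0.2, random_state=24)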

    We'll use this split going forward and reserve the test data as the unseen test set. Because the random seed is fixed, separate notebooks that try other modeling approaches can reproduce exactly the same training and test data.

  3. Instantiate MinMaxScaler...
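
Steps 1 and 2 above can be tried out without the case study file. The following is a minimal sketch, not the book's own code: it assumes a small synthetic DataFrame with made-up values and a shortened, illustrative `features_response` list (the column names other than 'default payment next month' are placeholders). It shows that slicing off the last list entry excludes the response, and that repeating the split with the same seed returns the same rows.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the case study data. The values are random and the
# column list is a shortened, illustrative assumption, not the real feature list.
rng = np.random.default_rng(0)
features_response = ['LIMIT_BAL', 'AGE', 'PAY_1',
                     'default payment next month']
df = pd.DataFrame({
    'LIMIT_BAL': rng.integers(10_000, 500_000, size=100),
    'AGE': rng.integers(21, 70, size=100),
    'PAY_1': rng.integers(-2, 9, size=100),
    'default payment next month': rng.integers(0, 2, size=100),
})

# Step 1: every name except the last one (the response) is a feature.
features = features_response[:-1]
X = df[features].values
y = df['default payment next month'].values

# Step 2: 80/20 split with a fixed random seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=24)

# Repeating the call with the same seed reproduces the same split, which is
# what lets separate notebooks share identical training and test data.
X_train_again, X_test_again, _, _ = train_test_split(
    X, y, test_size=0.2, random_state=24)

print(X_train.shape, X_test.shape)
print(np.array_equal(X_test, X_test_again))
```

With 100 synthetic rows and `test_size=0.2`, 80 rows go to training and 20 to the test set. Changing `random_state` to a different integer would give a different, but equally repeatable, split.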