You're reading from Applied Supervised Learning with Python Use scikit-learn to build predictive models from real-world datasets and prepare yourself for the future of machine learning

Product type Paperback

Published in Apr 2019

ISBN-13 9781789954920

Length 404 pages

Edition 1st Edition

Languages

Python

Tools

Scikit-learn

Concepts

Machine Learning

Authors (2):

Benjamin Johnston

Ishita Mathur

Table of Contents (9) Chapters

Applied Supervised Learning with Python

Preface

1. Python Machine Learning Toolkit

2. Exploratory Data Analysis and Visualization

3. Regression Analysis

4. Classification

5. Ensemble Modeling

6. Model Evaluation

Appendix

Chapter 6: Model Evaluation

Activity 15: Final Test Project

Solution

Import the relevant libraries:

```
import pandas as pd
import numpy as np
import json

%matplotlib inline
import matplotlib.pyplot as plt

from sklearn.preprocessing import OneHotEncoder
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (accuracy_score, precision_score, recall_score, confusion_matrix, precision_recall_curve)
```

Read the attrition_train.csv dataset into a DataFrame and print a summary of it with .info():
```
data = pd.read_csv('attrition_train.csv')
data.info()
```
The output will be as follows:
Figure 6.33: Output of info()
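To give a rough idea of what .info() reports, here is a minimal sketch on a toy DataFrame. The column names and values are invented for illustration and are not taken from attrition_train.csv; the real output will list that file's own columns.

```python
import io
import pandas as pd

# Toy data, invented for illustration only (not the attrition dataset).
toy = pd.DataFrame({
    "Age": [34, 45, 29],
    "Department": ["Sales", None, "HR"],  # one missing value
})

# info() prints to stdout by default; passing a buffer lets us capture the text.
buffer = io.StringIO()
toy.info(buf=buffer)
summary = buffer.getvalue()
print(summary)
```

The summary lists the index range, each column with its non-null count and dtype, and the memory usage, which makes missing values and unexpected dtypes easy to spot before modeling.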
Read the JSON file with the details of the categorical variables. The JSON file contains a dictionary, where the keys are the column names of the categorical features and the corresponding values are the list of categories in the feature. This file will help us one-hot encode the categorical...
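The idea of driving one-hot encoding from a dictionary of categories can be sketched as follows. This is an illustration under stated assumptions, not the book's own solution: the JSON content is an inline string standing in for the file, and the column names and categories are hypothetical.

```python
import json
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for the JSON file's contents:
# keys are categorical column names, values are the lists of categories.
categories_json = '{"Department": ["HR", "Sales", "R&D"], "Gender": ["Female", "Male"]}'
cat_values_dict = json.loads(categories_json)

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "Age": [34, 45, 29],
    "Department": ["Sales", "HR", "Sales"],
    "Gender": ["Male", "Female", "Female"],
})

cat_cols = list(cat_values_dict.keys())

# Passing the category lists explicitly fixes the output columns, so a
# category absent from the data (here "R&D") still gets its own column.
encoder = OneHotEncoder(categories=[cat_values_dict[col] for col in cat_cols])
encoded = encoder.fit_transform(toy[cat_cols]).toarray()

# Build readable column names in the same order the encoder uses.
encoded_cols = [f"{col}_{cat}" for col in cat_cols for cat in cat_values_dict[col]]
encoded_df = pd.DataFrame(encoded, columns=encoded_cols, index=toy.index)

# Combine the untouched numeric columns with the encoded ones.
result = pd.concat([toy.drop(columns=cat_cols), encoded_df], axis=1)
print(result)
```

Supplying the categories up front, rather than letting the encoder infer them from the data, keeps the encoded columns identical across training and test sets even when one of them happens to lack a category.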

Authors (2)

Benjamin Johnston

Benjamin Johnston is a senior data scientist for one of the world's leading data-driven MedTech companies and is involved in the development of innovative digital solutions throughout the entire product development pathway, from problem definition to solution research and development, through to final deployment. He is currently completing his Ph.D. in ML, specializing in image processing and deep convolutional neural networks. He has more than 10 years of experience in medical device design and development, working in a variety of technical roles, and holds a first-class honors bachelor's degree in both engineering and medical science from the University of Sydney, Australia.

Ishita Mathur

Ishita Mathur has worked as a data scientist for 2.5 years with product-based start-ups working with business concerns in various domains and formulating them as technical problems that can be solved using data and machine learning. Her current work at GO-JEK involves the end-to-end development of machine learning projects, by working as part of a product team on defining, prototyping, and implementing data science models within the product. She completed her masters' degree in high-performance computing with data science at the University of Edinburgh, UK, and her bachelor's degree with honors in physics at St. Stephen's College, Delhi.
