You're reading from The Machine Learning Workshop - Second Edition

Product type: Book

Published in: Jul 2020

Reading Level: Intermediate

Publisher: Packt

ISBN-13: 9781839219061

Edition: 2nd Edition

Languages: Python

Tools: Jupyter

Concepts: Machine Learning

Author: Hyatt Saleh

1. Introduction to Scikit-Learn

Activity 1.01: Selecting a Target Feature and Creating a Target Matrix

Solution:

Load the titanic dataset using the seaborn library:
```
import seaborn as sns
titanic = sns.load_dataset('titanic')
titanic.head(10)
```
The first 10 rows should look as follows:
Figure 1.22: An image showing the first 10 instances of the Titanic dataset
Select your preferred target feature for the goal of this activity.
The preferred target feature could be either survived or alive, because both label whether a person survived the sinking. The following steps use survived. However, choosing alive would not affect the final shape of the variables.
Create both the features matrix and the target matrix. Make sure that you store the data from the features matrix in a variable, X, and the data from the target matrix in another variable, Y:
```
X = titanic.drop('survived',axis = 1)
Y = titanic...
```
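The separation described in the last step can be sketched on a small stand-in table. The toy DataFrame below is hypothetical (it only mimics a few Titanic columns so that the example runs without downloading anything), and selecting the target with `toy["survived"]` is one common choice, not necessarily the expression used in the truncated line above.

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data: a few rows, a few columns.
toy = pd.DataFrame({
    "survived": [0, 1, 1, 0],
    "pclass": [3, 1, 3, 1],
    "age": [22.0, 38.0, 26.0, 35.0],
    "alive": ["no", "yes", "yes", "no"],
})

# Features matrix: every column except the chosen target.
X = toy.drop("survived", axis=1)
# Target matrix: the chosen target column on its own.
Y = toy["survived"]

# Choosing 'alive' as the target instead yields matrices of the same shape.
X_alt = toy.drop("alive", axis=1)
Y_alt = toy["alive"]

print(X.shape, Y.shape)
print(X_alt.shape, Y_alt.shape)
```

Whichever of the two columns is chosen as the target, the other one remains in the features matrix in this sketch.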

2. Unsupervised Learning – Real-Life Applications

Activity 2.01: Using Data Visualization to Aid the Pre-processing Process

Solution:

Import all the required elements to load the dataset and pre-process it:
```
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
```
Load the previously downloaded dataset by using pandas' read_csv() function. Store the dataset in a pandas DataFrame named data:
```
data = pd.read_csv("wholesale_customers_data.csv")
```
Check for missing values in your DataFrame. Using the isnull() function plus the sum() function, count the missing values of the entire dataset at once:
```
data.isnull().sum()
```
The output is as follows:
```
Channel             0
Region              0
Fresh               0
Milk                0
Grocery             0
Frozen              0
Detergents_Paper    0
Delicassen          0
dtype: int64
```
As you can see from the preceding output, there are no missing values in the dataset.
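Since the wholesale customers data has nothing missing, it may help to see what the same check reports when values are absent. The following is a minimal sketch on a hypothetical DataFrame (the column names are borrowed from the output above, but the values are made up for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical data with two deliberately missing entries.
toy = pd.DataFrame({
    "Fresh": [10.0, np.nan, 30.0],
    "Milk": [5.0, 6.0, np.nan],
    "Grocery": [1.0, 2.0, 3.0],
})

# isnull() marks each missing cell as True; sum() counts them per column.
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# A second sum() collapses the per-column counts into a single total.
total_missing = int(missing_per_column.sum())
print(total_missing)
```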
Check for outliers...

3. Supervised Learning – Key Steps

Activity 3.01: Data Partitioning on a Handwritten Digit Dataset

Solution:

Import all the required elements to split a dataset, as well as the load_digits function from scikit-learn to load the digits dataset. Use the following code to do so:
```
from sklearn.datasets import load_digits
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.model_selection import KFold
```
Load the digits dataset and create pandas DataFrames containing the features and target matrices:
```
digits = load_digits()
X = pd.DataFrame(digits.data)
Y = pd.DataFrame(digits.target)
print(X.shape, Y.shape)
```
The shape of your features and target matrices should be as follows, respectively:
```
(1797, 64) (1797, 1)
```
Perform the conventional split approach, using a split ratio of 60/20/20%.
Using the train_test_split function, split the data into an initial train set and a test set:
```
X_new, X_test, \
Y_new, Y_test = train_test_split(X, Y, test_size...
```
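The arguments of the call above are cut off in this excerpt, so the following is only a sketch of how a 60/20/20 split can be obtained with two calls to train_test_split. The synthetic arrays, the test_size values, the random_state, and the names X_train and X_dev are assumptions made for illustration, not the book's exact code.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 instances with 4 features each.
X = np.arange(400).reshape(100, 4)
Y = np.arange(100)

# First split: hold out 20% of all instances as the test set.
X_new, X_test, Y_new, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0
)

# Second split: 25% of the remaining 80% equals 20% of the original data.
X_train, X_dev, Y_train, Y_dev = train_test_split(
    X_new, Y_new, test_size=0.25, random_state=0
)

print(X_train.shape, X_dev.shape, X_test.shape)
```

The second test_size is 0.25 rather than 0.2 because it is applied to the 80 instances left after the first split, not to the original 100.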

4. Supervised Learning Algorithms: Predicting Annual Income

Activity 4.01: Training a Naïve Bayes Model for Our Census Income Dataset

Solution:

In a Jupyter Notebook, import all the required elements to load and split the dataset, as well as to train a Naïve Bayes algorithm:
```
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
```
Load the pre-processed Census Income dataset. Next, separate the features from the target by creating two variables, X and Y:
```
data = pd.read_csv("census_income_dataset_preprocessed.csv")
X = data.drop("target", axis=1)
Y = data["target"]
```
Note that there are several ways to achieve the separation of X and Y. Use the one that you feel most comfortable with. However, take into account that X should contain the features of all instances, while Y should contain the class labels of all instances.
Divide the dataset into training, validation, and...
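The remaining steps are not shown in this excerpt. As a rough illustration of what training and evaluating a Naïve Bayes model looks like with the classes imported above, here is a minimal sketch. It uses synthetic data instead of the Census Income file, and a simple train/test split with an assumed test_size and random_state, so it is not the activity's actual solution.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Hypothetical data: two well-separated Gaussian clusters, one per class.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, (100, 3)),
    rng.normal(5.0, 1.0, (100, 3)),
])
Y = np.array([0] * 100 + [1] * 100)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0
)

# Fit the model on the training set only.
model = GaussianNB()
model.fit(X_train, Y_train)

# Predict class labels for unseen instances and measure mean accuracy.
pred = model.predict(X_test)
accuracy = model.score(X_test, Y_test)
print(pred.shape, accuracy)
```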

5. Artificial Neural Networks: Predicting Annual Income

Activity 5.01: Training an MLP for Our Census Income Dataset

Solution:

Import all the elements required to load and split a dataset, to train an MLP, and to measure accuracy:

```
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
```

Using the preprocessed Census Income Dataset, separate the features from the target, creating the variables X and Y:
```
data = pd.read_csv("census_income_dataset_preprocessed.csv")
X = data.drop("target", axis=1)
Y = data["target"]
```
As explained previously, there are several ways to achieve the separation of X and Y, and the main thing to consider is that X should contain the features for all instances, while Y should contain the class label of all instances.
Divide the dataset into training, validation, and testing sets, using a split ratio of 10...
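The rest of this solution is truncated here. To show how the imported pieces fit together, the following sketch trains an MLPClassifier and scores it with accuracy_score. The synthetic data, the single train/test split, and the max_iter and random_state values are assumptions chosen so that the example runs quickly; the activity itself uses the Census Income dataset and its own split.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Hypothetical data: two well-separated Gaussian clusters, one per class.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, (100, 3)),
    rng.normal(5.0, 1.0, (100, 3)),
])
Y = np.array([0] * 100 + [1] * 100)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0
)

# Default architecture (one hidden layer); random_state fixes the weight init.
model = MLPClassifier(random_state=0, max_iter=500)
model.fit(X_train, Y_train)

# Compare the predicted labels against the true labels of the held-out set.
pred = model.predict(X_test)
accuracy = accuracy_score(Y_test, pred)
print(accuracy)
```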

6. Building Your Own Program

Activity 6.01: Performing the Preparation and Creation Stages for the Bank Marketing Dataset

Solution:

Note

To ensure the reproducibility of the results available at https://packt.live/2RpIhn9, make sure that you use a random_state of 0 when splitting the datasets and a random_state of 2 when training the models.
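The effect of fixing random_state when splitting can be checked directly. The sketch below uses made-up arrays and an assumed test_size; it only demonstrates that two splits with the same seed select the same rows.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 20 instances with 2 features each.
X = np.arange(40).reshape(20, 2)
Y = np.arange(20)

# Two calls with the same random_state shuffle the rows identically.
_, X_test_a, _, Y_test_a = train_test_split(X, Y, test_size=0.2, random_state=0)
_, X_test_b, _, Y_test_b = train_test_split(X, Y, test_size=0.2, random_state=0)

same_split = np.array_equal(X_test_a, X_test_b) and np.array_equal(Y_test_a, Y_test_b)
print(same_split)
```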

Open a Jupyter Notebook and import all the required elements:

```
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import precision_score
```

Load the dataset into the notebook. Make sure that you load the one that was edited previously, named bank-full-dataset.csv, which is also available at https://packt.live/2wnJyny:
```
data = pd.read_csv("bank-full-dataset.csv")
data.head(10)
```
The output is as follows:
Figure 6.8: A screenshot showing...


Author (1)

Hyatt Saleh

Hyatt Saleh discovered the importance of data analysis for understanding and solving real-life problems after graduating from college as a business administrator. Since then, as a self-taught person, she not only works as a machine learning freelancer for many companies globally, but has also founded an artificial intelligence company that aims to optimize everyday processes. She has also authored Machine Learning Fundamentals, by Packt Publishing.