You're reading from Hands-On Predictive Analytics with Python

Product type: Book
Published in: Dec 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789138719
Edition: 1st Edition
Languages: Python
Tools: TensorFlow
Concepts: Predictive Analytics
Author (1): Alvaro Fuentes

Introducing Neural Nets for Predictive Analytics

In the last two chapters, we have presented some of the most basic and popular models for regression and classification tasks. In this chapter, we introduce a family of models based on neural networks. This family of models is the basis for the field of deep learning—an approach to machine learning behind some of the most exciting and recent advances in the field of artificial intelligence.

This chapter gives you enough knowledge to use neural networks for predictive analytics. The goal is to present the fundamental concepts behind these models and to learn how to train the most fundamental type of neural network, the multilayer perceptron (MLP).

First, we will cover the main concepts of neural networks when talking about the anatomy of an MLP; then we will discuss how these models learn to make predictions...
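To make the anatomy of an MLP concrete, here is a minimal sketch of a forward pass in plain Python. It is an illustration rather than the chapter's own code: the layer sizes, the weights, and the choice of ReLU for the hidden layer are all assumptions made for this example.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer.

    weights[j] holds the incoming weights of neuron j. Each neuron computes
    a weighted sum of the inputs plus its bias, then applies the activation
    (or returns the raw sum when no activation is given).
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: input -> hidden layer (ReLU) -> linear output layer."""
    hidden = dense(x, hidden_w, hidden_b, activation=relu)
    return dense(hidden, out_w, out_b)


# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output neuron
hidden_w = [[1.0, -1.0], [0.5, 0.5]]
hidden_b = [0.0, -1.0]
out_w = [[2.0, 1.0]]
out_b = [0.5]

prediction = mlp_forward([3.0, 1.0], hidden_w, hidden_b, out_w, out_b)
print(prediction)  # [5.5]
```

Training a network amounts to adjusting the numbers in the weight and bias lists so that the predictions get closer to the observed targets.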

Technical requirements

Python 3.6 or higher
Jupyter Notebooks
Recent versions of the following Python libraries: NumPy, pandas, matplotlib, Seaborn, and scikit-learn
Recent installations of TensorFlow and Keras

Introducing neural network models

Neural networks and deep learning have attracted a great deal of attention lately. Although these technologies are surrounded by hype and misunderstanding, they are behind some of the most important developments and breakthroughs in the field of artificial intelligence: self-driving cars, language translators, speech recognition, superhuman players in many board games, computer vision, and many other achievements are directly related to different kinds of deep learning models.

In this chapter, we will learn about one basic type of neural network model, the MLP. We will use these models to solve predictive analytics problems; in particular, we will apply them to the two examples we have been working on throughout the book. After finishing this chapter, we will be able to include...

Introducing TensorFlow and Keras

We now know that neural networks are a special type of machine learning model. Although these models usually need huge amounts of data before they start outperforming other machine learning approaches, one big advantage is that training them can exploit parallel hardware such as graphics processing units (GPUs), which perform the operations needed for training neural networks faster than traditional CPUs. This is why, in the past few years, specialized software frameworks capable of using GPUs have been developed; examples of these frameworks are Theano, Caffe, and TensorFlow. These frameworks have made deep learning models accessible to professionals outside specialized academic circles, thus democratizing the use of these powerful models. In this section, we introduce the two...

Regressing with neural networks

We will again use our diamonds dataset. Although this is a small dataset and an MLP is perhaps too complicated a model for this problem, there is no reason we cannot use one to solve it. In addition, remember that when we defined the hypothetical problem, we established that the stakeholders wanted predictions that were as accurate as possible, so let's see how accurate we can get with an MLP. As always, let's import the libraries we will use:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os
%matplotlib inline

Now, since we are beginning from scratch, load and prepare the dataset:

DATA_DIR = '../data'
FILE_NAME = 'diamonds.csv'
data_path = os.path.join(DATA_DIR, FILE_NAME)
diamonds = pd.read_csv(data_path)
## Preparation done from Chapter...

Classification with neural networks

Now, let's perform our classification task using a neural network. As you will see, the only change necessary in an MLP for it to be able to perform classification is in the output layer:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os
%matplotlib inline

As always, let's start from scratch and import and prepare our data:

# Loading the dataset
DATA_DIR = '../data'
FILE_NAME = 'credit_card_default.csv'
data_path = os.path.join(DATA_DIR, FILE_NAME)
ccd = pd.read_csv(data_path, index_col="ID")
ccd.rename(columns=lambda x: x.lower(), inplace=True)
ccd.rename(columns={'default payment next month':'default'}, inplace=True)

# getting the groups of features
bill_amt_features = ['bill_amt'+ str(i) for i in range(1,7)]
pay_amt_features = ['...
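The change in the output layer mentioned at the start of this section can be sketched in plain Python. The example assumes the usual setup for binary classification, a single output neuron with a sigmoid activation scored with binary cross-entropy; the scores and labels below are hypothetical and are not taken from the credit card dataset.

```python
import math


def sigmoid(z):
    """Squash a real-valued score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, y_prob, eps=1e-12):
    """Average binary cross-entropy over a list of examples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical raw scores produced by the last layer for three customers
scores = [0.0, 2.0, -2.0]
probabilities = [sigmoid(z) for z in scores]  # predicted probability of default
labels = [1, 1, 0]                            # hypothetical true classes
loss = binary_crossentropy(labels, probabilities)
```

A regression network would return the raw score directly; here the sigmoid turns it into a probability, and the loss penalizes confident predictions that turn out to be wrong.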

The dark art of training neural networks

From the results we got, we can see a clear symptom of overfitting: the training accuracy looks great (91%), but the testing accuracy is lower than even random guessing. The two most likely causes are as follows:

The model has too many parameters
The model has been trained for too long

Since we are overfitting, we need to try a regularization technique; the simplest one for neural networks is to train the model for fewer epochs. First, let's reset the weights and biases of the network to their initial values (the ones we saved in the class_initial_w.h5 file):

nn_classifier.load_weights('class_initial_w.h5')

Now that the weights have been reset, let's train our model again, this time for only 50 epochs:

batch_size = 64
n_epochs = 50
nn_classifier.compile(loss='binary_crossentropy'...
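The text fixes the number of epochs by hand. A related idea, shown here only as a sketch and not as the chapter's method, is early stopping: record the validation loss after every epoch and stop once it has not improved for a given number of epochs. The function name best_epoch, the patience parameter, and the loss values are all hypothetical.

```python
def best_epoch(val_losses, patience=5):
    """Return the 1-based epoch with the lowest validation loss, stopping
    the scan once `patience` consecutive epochs pass without improvement."""
    best_loss = float("inf")
    best = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best


# Hypothetical validation losses: improving at first, then getting worse
val_losses = [0.70, 0.62, 0.58, 0.57, 0.59, 0.61, 0.64, 0.56, 0.70]
stop_at = best_epoch(val_losses, patience=3)
print(stop_at)  # 4
```

With a patience of 3 the scan stops at epoch 7 and reports epoch 4; a larger patience would have waited long enough to find the slightly better epoch 8, which shows that the stopping rule is itself one more decision to tune.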

Summary

In this chapter, we introduced the most fundamental type of deep learning model, the MLP. We covered a lot of new concepts related to this powerful class of models, such as deep learning, neural network models, and the activation functions of neurons. We also learned about TensorFlow, a framework for training deep learning models; we used it as a backend for running the calculations necessary to train our models. We covered Keras, where we first build a network, then compile it (indicating the loss and optimizer), and finally train the model. Lastly, we covered dropout, a regularization technique that is often used with neural networks, although it works best with very large networks. To conclude, neural networks are hard to train because they involve making many decisions; a lot of practice and knowledge is needed to be able to use these models...
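As a reminder of what dropout does during training, here is a minimal sketch in plain Python. It assumes the common "inverted dropout" formulation, in which the surviving activations are rescaled so that their expected value is unchanged; the function and variable names are hypothetical, and in practice a framework layer would take care of this.

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout for training: zero each activation with probability
    `rate` and scale the survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [a * keep_scale if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden = [1.0] * 10     # hypothetical activations of a hidden layer
dropped = dropout(hidden, rate=0.5, rng=rng)
```

Each entry of dropped is either 0.0 or 2.0. At prediction time no units are dropped, which is why the rescaling is applied during training.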

Bengio, Y (2012). Practical recommendations for gradient-based training of deep architectures. Neural networks: Tricks of the trade (pp. 437-478).
Chollet, F (2017). Deep Learning with Python, Manning Publications.
Glorot, X, and Bengio, Y (2010, March). Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (pp. 249-256).
Keskar, N. S., et al. (2016). On large-batch training for deep learning: Generalization gap and sharp minima. arXiv preprint arXiv:1609.04836.


Author (1)

Alvaro Fuentes

Alvaro Fuentes is a senior data scientist with a background in applied mathematics and economics. He has more than 14 years of experience in various analytical roles and is an analytics consultant at one of the 'Big Three' global management consulting firms, leading advanced analytics projects in different industries like banking, technology, and consumer goods. Alvaro is also an author and trainer in analytics and data science and has published courses and books, such as 'Become a Python Data Analyst' and 'Hands-On Predictive Analytics with Python'. He has also taught data science and related topics to thousands of students both on-site and online through different platforms such as Springboard, Simplilearn, Udemy, and BSG Institute, among others.
