Chapter 4. Neural Networks

So far, we've looked at two of the most well-known methods used for predictive modeling. Linear regression is probably the most typical starting point for problems where the goal is to predict a numerical quantity. The model is based on a linear combination of input features. Logistic regression uses a nonlinear transformation of this linear feature combination in order to restrict the range of the output in the interval [0,1]. In so doing, it predicts the probability that the output belongs to one of two classes. Thus, it is a very well-known technique for classification.
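
As a quick reminder of how that transformation works, here is a tiny R snippet (purely illustrative, not code from the book) showing the logistic function squashing a linear combination of features into the interval (0, 1):

# The logistic (sigmoid) function maps any real number into (0, 1)
logistic <- function(z) 1 / (1 + exp(-z))

# A linear combination of two example features with some example weights
x <- c(2.5, -1.3)
w <- c(0.8, 0.4)
b <- -0.2

logistic(sum(w * x) + b)   # a probability-like value strictly between 0 and 1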

Both methods share the disadvantage that they are not robust when dealing with many input features. In addition, logistic regression is typically used for the binary classification problem. In this chapter, we will introduce the concept of neural networks, a nonlinear approach to solving both regression and classification problems. They are significantly more robust when dealing with a higher...

The biological neuron


Neural network models draw an analogy with the organization of neurons in the human brain, and for this reason they are also often referred to as artificial neural networks (ANNs) to distinguish them from their biological counterparts. The key parallel is that a single biological neuron acts as a simple computational unit, but when a large number of these are combined, the result is an extremely powerful and massively distributed processing machine capable of complex learning: the human brain. To get an idea of how neurons are connected in the brain, the following image shows a simplified picture of a human neural cell:

In a nutshell, we can think of a human neuron as a computational unit that takes in a number of parallel inputs in the form of chemical signals, known as neurotransmitters, arriving across synapses at its dendrites. In response to the received neurotransmitters, the dendrites carry electrical signals to the soma, or body, of the neuron...

The artificial neuron


Using our biological analogy, we can construct a model of a computational neuron, and this model is known as the McCulloch-Pitts model of a neuron:

Note

Warren McCulloch and Walter Pitts proposed this model of a neural network as a computing machine in a paper titled A logical calculus of the ideas immanent in nervous activity, published in the Bulletin of Mathematical Biophysics in 1943.

This computational neuron is the simplest example of a neural network. Following our diagram, we can write the output function, y, of our neural network directly as y = g(w0 + w1x1 + w2x2 + ... + wkxk), where x1 through xk are the input features and w1 through wk their corresponding weights.

The function g() in our neural network is the activation function. Here, the specific activation function that is chosen is the step function: g(z) = 1 when z > 0, and g(z) = -1 otherwise.

When the linear weighted sum of inputs exceeds zero, the step function outputs 1, and when it does not, the function outputs -1. It is customary to create a dummy input feature x0 which is always taken to be 1, in order to merge the bias or threshold w0 into the main sum as follows...
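
As a rough R illustration of such a unit (an illustrative sketch, not code from the book), the following function applies the step activation to a weighted sum of inputs, with the bias merged in through a dummy feature that is always equal to 1:

# Step activation: +1 when the weighted sum is positive, -1 otherwise
step_activation <- function(z) ifelse(z > 0, 1, -1)

# Output of a McCulloch-Pitts style neuron; w[1] plays the role of the bias w0
neuron_output <- function(x, w) {
  x_with_bias <- c(1, x)                       # dummy input feature x0 = 1
  step_activation(sum(w * x_with_bias))
}

# Example with two inputs and weights (w0, w1, w2)
neuron_output(x = c(0.7, -0.2), w = c(-0.5, 1.0, 2.0))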

Stochastic gradient descent


In the models we've seen so far, such as linear regression, we've talked about a criterion or objective function that the model must minimize while it is being trained. This criterion is also sometimes known as the cost function. For example, the least squares cost function for a model can be expressed as J = ½ Σi (yi - ŷi)², where yi is the observed output for the ith observation and ŷi is the model's prediction for it.

We've added a constant factor of ½ in front of this for reasons that will become apparent shortly. We know from basic differentiation that, when we are minimizing a function, multiplying it by a positive constant factor does not change where the minimum occurs, only the value attained there. In linear regression, just as with our perceptron model, our model's predicted output is just a linear weighted combination of the input features. If we assume that our data is fixed and that the weights are variable and must be chosen so as to minimize our criterion, we can treat the cost function as being a function of the weights: J(w) = ½ Σi (yi - w · xi)², where xi is the feature vector of the ith observation (including the dummy bias feature x0 = 1).

We have used the letter w to represent the model...
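
To make the idea concrete, here is a minimal stochastic gradient descent sketch for a linear model under the ½ squared-error cost. This is illustrative code rather than the author's implementation, and the learning rate and epoch count are arbitrary example values:

# Minimal stochastic gradient descent for linear regression
# X: matrix of inputs (one row per observation), y: numeric output vector
sgd_linear <- function(X, y, learning_rate = 0.01, epochs = 50) {
  X <- cbind(1, X)                     # dummy feature x0 = 1 to absorb the bias w0
  w <- rep(0, ncol(X))                 # initialize all weights to zero
  for (epoch in seq_len(epochs)) {
    for (i in sample(nrow(X))) {       # visit the observations in random order
      prediction <- sum(w * X[i, ])
      error <- prediction - y[i]
      # The gradient of (1/2) * (prediction - y)^2 with respect to w is error * x,
      # which is where the convenient factor of 1/2 pays off
      w <- w - learning_rate * error * X[i, ]
    }
  }
  w
}

# Example on synthetic data generated from y = 2 + 3x plus a little noise
set.seed(1)
x <- matrix(rnorm(100), ncol = 1)
y <- 2 + 3 * x[, 1] + rnorm(100, sd = 0.1)
sgd_linear(x, y)                       # the weights should approach c(2, 3)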

Multilayer perceptron networks


Multilayer neural networks are models that connect many neurons together into a layered architecture. Individually, neurons are very basic computational units, but by organizing them into connected layers we can create a model significantly more powerful than any single neuron.

As touched upon in the previous section, we build neural networks in layers and we distinguish between different kinds of neural networks primarily on the basis of the connections that exist between these layers and the types of neurons used. The following diagram shows the general structure of a multilayer perceptron (MLP) neural network, shown here for two hidden layers:

The first characteristic of the MLP network is that the information flows in a single direction from input layer to output layer. Thus, it is known as a feedforward neural network. This is in contrast to other neural network types, in which there are cycles that allow information to flow back to earlier neurons in the network as a feedback...
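
As one concrete way to build such an architecture in R, the neuralnet package (one of several packages that fit feedforward networks, used here purely for illustration) lets us specify multiple hidden layers directly; the layer sizes below are arbitrary choices for demonstration:

# install.packages("neuralnet")   # if the package is not already installed
library(neuralnet)

# Scale the variables so that training converges more easily
df <- as.data.frame(scale(mtcars[, c("mpg", "wt", "hp")]))

# A feedforward network with two hidden layers of 4 and 2 neurons
model <- neuralnet(mpg ~ wt + hp, data = df, hidden = c(4, 2),
                   linear.output = TRUE)

# plot(model)   # draws the layered, feedforward structure of the fitted network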

Predicting the energy efficiency of buildings


In this section, we will investigate how neural networks can be used to solve a real-world regression problem. Once again, we turn to the UCI Machine Learning Repository for our data set. We've chosen to try out the energy efficiency data set available at http://archive.ics.uci.edu/ml/datasets/Energy+efficiency. The prediction task is to use various building characteristics, such as surface area and roof area, in order to predict the energy efficiency of a building, which is expressed in the form of two different metrics—heating load and cooling load.

This is a good example for us to try out as we can demonstrate how neural networks can be used to predict two different outputs with a single network. The full attribute description of the data set is given in the following table:

Column name       Type        Definition
relCompactness    Numerical   Relative compactness
surfArea          Numerical   Surface area
wallArea          Numerical   Wall area
roofArea          Numerical   Roof area
...
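
As a hedged sketch of the kind of model this section goes on to build, the neuralnet package (one possible choice, used here for illustration) allows both responses to appear on the left-hand side of the formula so that a single network predicts them jointly. The output column names heatLoad and coolLoad, the synthetic stand-in data, and the hidden-layer size are all assumptions made purely for this example:

library(neuralnet)

# Synthetic stand-in data so the example runs on its own; in the chapter, these
# columns come from the UCI energy efficiency data set
set.seed(1)
energy <- data.frame(relCompactness = runif(200), surfArea = runif(200),
                     wallArea = runif(200), roofArea = runif(200))
energy$heatLoad <- with(energy, 2 * relCompactness + surfArea + rnorm(200, sd = 0.1))
energy$coolLoad <- with(energy, relCompactness + 0.5 * roofArea + rnorm(200, sd = 0.1))

# One network with two output neurons: both responses on the left of the formula
energy_model <- neuralnet(
  heatLoad + coolLoad ~ relCompactness + surfArea + wallArea + roofArea,
  data = energy, hidden = 4, linear.output = TRUE)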

Predicting glass type revisited


In Chapter 3, Logistic Regression, we analyzed the glass identification data set, in which the task is to identify the type of glass that makes up a fragment found at a crime scene. The output of this data set is a factor with several class levels corresponding to different types of glass. Our previous approach was to build a one-versus-all model using multinomial logistic regression. The results were not very promising, and one of the main points of concern was a poor model fit on the training data.

In this section, we will revisit this data set and see whether a neural network model can do better. At the same time, we will demonstrate how neural networks can handle classification problems as well:

> glass <- read.csv("glass.data", header = FALSE)
> names(glass) <- c("id", "RI", "Na", "Mg", "Al", "Si", "K", "Ca", 
                    "Ba", "Fe", "Type")
> glass$id <- NULL

Our output is a multiclass factor and so we will want to dummy-encode this...
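
One convenient way to perform this dummy encoding, shown here purely as an illustration (the class.ind() function from the nnet package is one of several options and not necessarily the author's choice), is to turn the factor into a matrix of 0/1 indicator columns, one per glass type:

> library(nnet)                              # provides class.ind()
> glass$Type <- factor(glass$Type)
> type_indicators <- class.ind(glass$Type)
> head(type_indicators)                      # one column per glass type, a single 1 per row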

Predicting handwritten digits


Our final application for neural networks will be the handwritten digit prediction task. In this task, the goal is to build a model that will be presented with an image of a numerical digit (0–9) and the model must predict which digit is being shown. We will use the MNIST database of handwritten digits from http://yann.lecun.com/exdb/mnist/.

From this page, we have downloaded and unzipped the two training files train-images-idx3-ubyte.gz and train-labels-idx1-ubyte.gz. The former contains the data from the images and the latter contains the corresponding digit labels. The advantage of using this website is that the data has already been preprocessed by centering each digit in the image and scaling the digits to a uniform size. To load the data, we've used information from the website about the IDX format to write two functions:

read_idx_image_data <- function(image_file_path) {
  con <- file(image_file_path, "rb")
  magic_number <- readBin(con, what ...
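
The rest of the function is not shown above, but as a rough sketch of what reading this format involves (an illustrative, hypothetical function, not the author's implementation), the standard IDX3 image file starts with four big-endian 32-bit integers, namely the magic number, the number of images, and the image height and width, followed by one unsigned byte per pixel:

# Hypothetical sketch of an IDX3 image reader; the max_images parameter is an
# added convenience and is not part of the format
read_idx_images_sketch <- function(image_file_path, max_images = 100) {
  con <- file(image_file_path, "rb")
  on.exit(close(con))
  magic_number <- readBin(con, what = integer(), n = 1, size = 4, endian = "big")
  n_images <- readBin(con, what = integer(), n = 1, size = 4, endian = "big")
  n_rows   <- readBin(con, what = integer(), n = 1, size = 4, endian = "big")
  n_cols   <- readBin(con, what = integer(), n = 1, size = 4, endian = "big")
  n_images <- min(n_images, max_images)
  pixels <- readBin(con, what = integer(), n = n_images * n_rows * n_cols,
                    size = 1, signed = FALSE)
  # One image per row, with pixel intensities in the range [0, 255]
  matrix(pixels, nrow = n_images, ncol = n_rows * n_cols, byrow = TRUE)
}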

Summary


In this chapter, we saw that neural networks are a nonlinear method capable of solving both regression and classification problems. Motivated by the biological analogy with human neurons, we first introduced the simplest neural network, the perceptron. The perceptron is able to solve binary classification problems only when the two classes are linearly separable, a condition we can rarely rely upon in practice.

By changing the function that transforms the linear weighted combination of inputs, namely the activation function, we discovered how to create different types of individual neurons. A linear activation function creates a neuron that performs linear regression, whereas the logistic activation function creates a neuron that performs logistic regression. By organizing and connecting neurons into layers, we can create multilayer neural networks that are powerful models for solving nonlinear problems.

The idea behind having hidden layers of neurons is that each hidden layer learns a new set of...

