
You're reading from Advanced Analytics with R and Tableau

Product type: Book
Published in: Aug 2017
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781786460110
Edition: 1st Edition
Authors (3):
Ruben Oliva Ramos

Ruben Oliva Ramos is a computer systems engineer from Tecnologico de Leon Institute, with a master's degree in computer and electronic systems engineering and a specialization in teleinformatics and networking from the University of Salle Bajio in Leon, Guanajuato, Mexico. He has more than 5 years of experience in developing web applications to control and monitor devices connected with Arduino and Raspberry Pi, using web frameworks and cloud services to build Internet of Things applications. He is a mechatronics teacher at the University of Salle Bajio and teaches students of the master's degree in design and engineering of mechatronics systems. Ruben also works at Centro de Bachillerato Tecnologico Industrial 225, teaching subjects such as electronics, robotics and control, automation, and microcontrollers. He is a consultant and developer for projects in areas such as monitoring systems and datalogger data, using technologies (such as Android, iOS, HTML5, and ASP.NET), databases (such as SQLite, MongoDB, and MySQL), web servers, hardware programming, and control and monitoring systems for data acquisition and programming.

Jen Stirrup

Jen Stirrup is a data strategist and technologist, a Microsoft Most Valuable Professional (MVP), and a Microsoft Regional Director, a tech community advocate, a public speaker and blogger, a published author, and a keynote speaker. Jen is the founder of a boutique consultancy based in the UK, Data Relish, which focuses on delivering successful business intelligence and artificial intelligence solutions that add real value to customers worldwide. She has featured on the BBC as a guest expert on topics relating to data.

Chapter 7. Advanced Analytics with Unsupervised Learning

Would you like to know how to make predictions from a dataset? Alternatively, would you like to find exceptions, or outliers that you need to watch out for?

Neural networks are used in business to answer these questions. They can make predictions from a dataset or find unusual patterns, and they are best suited to regression or classification problems.

In this chapter, we will look at neural networks as a specific example of advanced analytics, and how they can be used to answer real-life business questions.

What are neural networks?


Neural networks are one of the most interesting machine learning models. They are algorithms inspired by the structures of the brain, and they mimic the way it functions. They can be used as unsupervised algorithms, where we do not always know what the outputs should be, although networks trained with backpropagation, as in this chapter, learn from examples whose outputs are known.

Neural networks have layers, which can be categorized into the following:

  • Input layer

  • Middle layer

  • Output layer

The input layer consumes the data, and the output layer represents the result. The middle layer, often called the hidden layer, is the part of the algorithm that transforms the values from the input layer into the values reported by the output layer.
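To make the three layers concrete, here is a minimal sketch in base R of one pass of data through a tiny network. The layer sizes, the weight and bias values, and the choice of a logistic activation function are all assumptions made for illustration; they do not come from the chapter's model.

```r
# Logistic (sigmoid) activation, applied element-wise
sigmoid <- function(z) 1 / (1 + exp(-z))

# Hypothetical weights: 2 inputs -> 3 middle-layer units -> 1 output
W_hidden <- matrix(c( 0.5, -0.3,
                      0.8,  0.2,
                     -0.6,  0.9), nrow = 3, byrow = TRUE)
b_hidden <- c(0.1, -0.2, 0.05)
W_output <- matrix(c(0.7, -0.4, 0.3), nrow = 1)
b_output <- 0.2

forward <- function(x) {
  # Middle layer: weighted sum of the inputs plus bias, then activation
  hidden <- sigmoid(W_hidden %*% x + b_hidden)
  # Output layer: weighted sum of the middle-layer values plus bias
  output <- sigmoid(W_output %*% hidden + b_output)
  list(hidden = as.vector(hidden), output = as.vector(output))
}

# The input layer simply receives the data values
result <- forward(c(1.0, 2.0))
print(result)
```

The input vector has one value per input unit, the middle layer produces three intermediate values, and the output layer reduces them to a single number between 0 and 1.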

Different types of neural networks

The simplest type of neural network is known as a feedforward neural network. It feeds information in one direction only, from the input layer through to the output layer. The most basic network of this type is known as a perceptron. The following figure illustrates a perceptron:

Neural network training process

Neural networks can also feed information back down through the layers...

Backpropagation and Feedforward neural networks


Training a neural network is an iterative process, which involves discovering values for its weights and its bias terms. These are used in conjunction with the input values to create outputs. After many iterations, the model is tested to decide whether it is ready to become a production model that can be used to make predictions.

Training is an integral part of the modeling phase of the Cross Industry Standard Process for Data Mining (CRISP-DM). It involves working out the weights and bias values that lead the inputs towards the preferred output. As part of the training process, the model can be presented with test data in order to evaluate its accuracy. This helps us to understand how well the model will perform when it is given new data for which we don't know the true outputs.
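The idea of iteratively adjusting weights and a bias, and then checking accuracy on held-back test data, can be sketched in base R with a single neuron. This is only an illustration under several assumptions: the setosa-versus-rest target, the 70/30 split, the learning rate, the number of epochs, and the use of batch gradient descent (all rows at once rather than one row at a time) are choices made here, not the chapter's method.

```r
set.seed(42)

# Two scaled inputs and a binary target: is the flower a setosa (1) or not (0)?
x <- as.matrix(scale(iris[, c("Petal.Length", "Petal.Width")]))
y <- as.numeric(iris$Species == "setosa")

# Hold back 30% of the rows as test data
test_idx <- sample(nrow(x), size = round(0.3 * nrow(x)))
x_train <- x[-test_idx, ]
y_train <- y[-test_idx]
x_test  <- x[test_idx, ]
y_test  <- y[test_idx]

sigmoid <- function(z) 1 / (1 + exp(-z))

# Start from zero weights and bias, then refine them iteratively
weights <- c(0, 0)
bias <- 0
learning_rate <- 0.5

for (epoch in 1:200) {
  pred  <- as.vector(sigmoid(x_train %*% weights + bias))
  error <- pred - y_train
  # Move the weights and bias a small step against the gradient of the error
  weights <- weights - learning_rate * as.vector(t(x_train) %*% error) / length(y_train)
  bias    <- bias - learning_rate * mean(error)
}

# Evaluate on rows the neuron never saw during training
test_pred <- as.vector(sigmoid(x_test %*% weights + bias)) > 0.5
accuracy  <- mean(test_pred == (y_test == 1))
print(accuracy)
```

A full network repeats the same idea across many neurons and layers, with backpropagation supplying the gradients for the middle layers.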

During the training process, rows are presented to the neural network consecutively, one at...

Evaluating a neural network model


Another fundamental phase of the CRISP-DM methodology is the evaluation phase, which focuses on the quality of the model and its ability to meet the overall business objectives. If the model can't meet the objectives, it's important to understand whether there is a business reason for this, as well as any technical reasons that might account for the failure. It's also a good time to pause and consider the testing results that you have generated so far. This is a crucial stage because it can reveal challenges that didn't appear before. It is also an interesting phase, because you can find new directions for future research. Therefore, it's important not to skip it!

Fortunately, we can visualize the results using Tableau so that the neural networks are easier to understand. There are several performance measures for neural networks, and we will explore these in more detail along with a discussion...

Neural network performance measures


In the meantime, however, let's make the concepts of the neural net clear by looking at the options for visualizing the results.

Receiver Operating Characteristic curve

Here is an example of a Receiver Operating Characteristic (ROC) curve, which plots the true positive rate against the false positive rate as the classification threshold changes.

The closer this curve is to the upper left corner, the better the model's performance is. It means the model is better at identifying true positives while keeping false positives to a minimum. In this example, we can see that the model is performing well.
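As a rough illustration of how the points on an ROC curve are obtained, the following base R sketch sweeps a threshold over a set of classifier scores and records the true positive rate and false positive rate at each step. The scores and labels are invented for the example, and `roc_points` is a hypothetical helper name, not a function from any package.

```r
# Hypothetical classifier scores and true labels (1 = positive, 0 = negative)
scores <- c(0.95, 0.85, 0.80, 0.70, 0.55, 0.45, 0.40, 0.30, 0.20, 0.10)
labels <- c(1,    1,    0,    1,    1,    0,    1,    0,    0,    0)

roc_points <- function(scores, labels) {
  # Start above every score (nothing predicted positive), then lower the threshold
  thresholds <- c(Inf, sort(unique(scores), decreasing = TRUE))
  tpr <- sapply(thresholds, function(t) sum(scores >= t & labels == 1) / sum(labels == 1))
  fpr <- sapply(thresholds, function(t) sum(scores >= t & labels == 0) / sum(labels == 0))
  data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
}

roc <- roc_points(scores, labels)

# Area under the curve by the trapezoidal rule; values closer to 1 are better
auc <- sum(diff(roc$fpr) * (head(roc$tpr, -1) + tail(roc$tpr, -1)) / 2)

print(roc)
print(auc)
```

The `fpr` and `tpr` columns are the x and y coordinates of the curve, so the resulting data frame could be plotted in R or loaded into Tableau as a line chart.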

Precision and Recall curve

Precision and recall curves are very useful for assessing models in terms of business questions. They offer more detail and insights into the model's performance. Here is an example:

Precision can be described as the fraction of the cases the model classifies as positive that really are positive. It can be considered as a measure of confirmation, and it indicates...
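A small worked example may help here. The sketch below uses the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN), on a handful of invented predictions; the `actual` and `predicted` vectors are made up purely for illustration.

```r
# Hypothetical true classes and model predictions (1 = positive, 0 = negative)
actual    <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
predicted <- c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0)

tp <- sum(predicted == 1 & actual == 1)  # true positives
fp <- sum(predicted == 1 & actual == 0)  # false positives
fn <- sum(predicted == 0 & actual == 1)  # false negatives

# Of the cases flagged as positive, how many really were positive?
precision <- tp / (tp + fp)
# Of the cases that really were positive, how many did the model find?
recall <- tp / (tp + fn)

print(c(precision = precision, recall = recall))
```

Here the model flags four cases, three of them correctly, and misses one real positive, so both measures come out at 0.75.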

Visualizing neural network results


Let's work through an example of a neural network, using publicly accessible data. We will use R to create the neural network, and then we will visualize the results in Tableau.

Neural network in R


Let's load up the libraries that we need. We are going to use the neuralnet package, a flexible package created for training neural networks using the backpropagation method. We discussed the backpropagation method previously in this chapter.

Let's install the package using the following command:

install.packages("neuralnet")

Now, let's load the library:

library(neuralnet)

We need to load up some data. We will use the iris dataset, which is available from the UCI website and is also installed along with R. You can check that you have it by typing iris at the R console. You should get 150 rows of data.

If not, then download the data from the UCI website, and rename the file to iris.csv. Then, use the Import Dataset button in RStudio to import the data.
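The check described above can also be scripted. This is a minimal sketch, assuming that a downloaded iris.csv sits in the working directory and has a header row with the same column names as the built-in dataset; `iris_data` is a name chosen here for illustration.

```r
# Use the built-in dataset when it is available; otherwise fall back to a local file
if (exists("iris") && is.data.frame(iris)) {
  iris_data <- iris
} else {
  # Assumption: iris.csv is in the working directory and has a header row
  iris_data <- read.csv("iris.csv", stringsAsFactors = TRUE)
}

# Confirm that the expected 150 rows are present and inspect the columns
print(nrow(iris_data))
str(iris_data)
```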

Now, let's assign the iris data to a variable named data. Then, let's look at the data to see if it is loaded correctly. It's enough to look at the first few rows of data, and...
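The steps so far, followed by one possible way of fitting a network, might look like the sketch below. Everything after the call to `head()` is an assumption rather than the book's own code: the indicator columns, the scaling of the inputs, the single middle layer of three units, the seed, and the `stepmax` value are illustrative choices. The sketch uses `neuralnet()` and `compute()` from the neuralnet package; newer versions of the package also provide a `predict()` method.

```r
library(neuralnet)

set.seed(123)

# Assign the iris data and look at the first few rows
data <- iris
head(data)

# One 0/1 indicator column per species, so the network has three outputs
data$setosa     <- as.numeric(data$Species == "setosa")
data$versicolor <- as.numeric(data$Species == "versicolor")
data$virginica  <- as.numeric(data$Species == "virginica")

# Scale the four measurements, which usually helps training converge
features <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
data[features] <- scale(data[features])

# Train a network with one middle layer of three units (an illustrative size)
nn <- neuralnet(
  setosa + versicolor + virginica ~
    Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
  data = data,
  hidden = 3,
  linear.output = FALSE,  # logistic outputs, suitable for classification
  stepmax = 1e6           # allow plenty of training steps
)

# Score every row and pick the species with the highest output
probs <- neuralnet::compute(nn, data[features])$net.result
predicted <- c("setosa", "versicolor", "virginica")[apply(probs, 1, which.max)]

# Compare predicted and actual species
print(table(actual = data$Species, predicted = predicted))
```

Because this sketch scores the same rows it was trained on, the resulting table shows how well the network fits the data rather than how well it would generalize.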

Modeling and evaluating data in Tableau


Neural networks are often difficult to understand. When the data is loaded into Tableau, we can visually understand the distinctions made by the underlying model. Since we can easily load data into Tableau, we can do this on an ongoing basis.

In this example, we will use Tableau as part of the testing process. We will present the model with data, and see how well the R model can distinguish between the three types of iris. Once we have set up the Tableau workbook, we can load more data into the workbook, using the connect to data facility. This would help us to see if the model continues to distinguish between the iris types, and we could continue to test the model on an ongoing basis.
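One simple way to get results from R into Tableau is to write them to a CSV file and connect to that file as a text data source. The sketch below shows only the export step and rests on assumptions: `predict_species` is a hypothetical rule-based stand-in for the trained network's predictions, and the file name and temporary folder are arbitrary choices.

```r
# Hypothetical stand-in for the model: in practice this column would hold
# the predictions produced by the trained neural network
predict_species <- function(petal_length) {
  ifelse(petal_length < 2.5, "setosa",
         ifelse(petal_length < 4.8, "versicolor", "virginica"))
}

# Combine the measurements, the actual species, and the predicted species
results <- data.frame(
  iris,
  predicted = predict_species(iris$Petal.Length),
  stringsAsFactors = FALSE
)

# Flag each row as correct or not, which is convenient for colouring marks in Tableau
results$correct <- as.character(results$Species) == results$predicted

# Write a file that Tableau can connect to as a text data source
output_file <- file.path(tempdir(), "iris_predictions.csv")
write.csv(results, output_file, row.names = FALSE)
print(output_file)
```

In Tableau, plotting two of the measurements against each other and colouring the marks by the `predicted` or `correct` column gives a quick visual check of where the model separates the species and where it does not.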

Using Tableau to evaluate data

Let's load more data into Tableau. For our testing purposes, we will reuse the iris data that we used in the earlier example. However, in a real project, this would not be the best practice for testing purposes. Here, we are reusing the...

Summary


In this chapter, you have learned how to work with data using unsupervised learning techniques and algorithms in R and Tableau. Tableau's role is simple: we use it to visualize the analytics. It also helps to deliver the overall data science project, because it lets business users evaluate and understand the models that they have been given. We can get the data, analyze it, and evaluate it in a practical way.

In the next chapter, we will see how to interpret the results and the numbers once you have them, and how to make them understandable by applying the data and visualizations to real-life situations.
