R Deep Learning Projects

Product type: Book | Published: Feb 2018 | Publisher: Packt | ISBN-13: 9781788478403 | Pages: 258 | Edition: 1st

Chapter 3. Fraud Detection with Autoencoders

In this chapter, we continue our journey into deep learning with R by exploring autoencoders.

A classical autoencoder consists of three parts:

  • An encoding function, which compresses your data
  • A decoding function, which reconstructs data from a compressed version
  • A metric or distance, which measures the information lost between your original data and its reconstruction

We typically assume that all of these functions are smooth enough for backpropagation or other gradient-based methods to apply, although they need not be; derivative-free methods could also be used to train them.
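To make the three parts concrete, here is a deliberately trivial sketch in base R (the function names are ours, for illustration only, and nothing is trained): the encoder "compresses" a 2D point by dropping its second coordinate, the decoder pads it back with a zero, and the metric is the squared error between original and reconstruction.

```r
# Toy illustration of the three parts of an autoencoder (not a trained model):
# a 2D point is "compressed" to 1D by discarding its second coordinate.
encode <- function(x) x[1]                      # encoding function: 2D -> 1D
decode <- function(z) c(z, 0)                   # decoding function: 1D -> 2D
loss   <- function(x, x_hat) sum((x - x_hat)^2) # squared-error distance

x     <- c(3, 4)
x_hat <- decode(encode(x))
loss(x, x_hat)  # 16: the information lost by this crude compression
```

A real autoencoder replaces `encode` and `decode` with neural networks whose weights are chosen to make this loss small on average over the data.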

Note

Autoencoding is the process of summarizing information from a potentially large feature set into a smaller feature set.

Although the compression step might remind you of compression algorithms such as MP3, an important difference is that autoencoders are data-specific. An autoencoder trained on pictures of cats and dogs will likely perform poorly on pictures...

Getting ready


In this chapter, we will introduce keras and tensorflow for R. keras is a model-level library, in that it provides a high-level interface for quickly developing deep learning models. Instead of implementing low-level operations such as convolutions and tensor products itself, it relies on Theano, TensorFlow, or CNTK as a backend and, according to the development team, more backends will be supported in the future.

Why do you need a backend? Well, when the computation becomes more complicated, which is often the case in deep learning, you need different computation strategies (known as computation graphs) and hardware (GPUs). For instructional purposes, all of our sample code runs without a GPU.

Installing Keras and TensorFlow for R

As per the official documentation, you can install Keras simply with:

devtools::install_github("rstudio/keras")

The Keras R interface uses TensorFlow as its backend engine by default. To install both the core Keras library and TensorFlow, then do:

library(keras)
install_keras()

Our first examples


Let's begin with a few simple examples to understand what is going on.

For some of us, it is very tempting to reach straight for the shiniest algorithms and hyper-parameter optimization, but the less glamorous, step-by-step understanding pays off.

A simple 2D example

Let's develop our intuition of how the autoencoder works with a simple two-dimensional example. 

We first generate 10,000 points from a bivariate normal distribution with mean zero and identity covariance matrix (each coordinate has mean 0 and variance 1):

library(MASS)   # for mvrnorm
library(keras)

# Identity covariance: two uncorrelated, unit-variance dimensions
Sigma <- matrix(c(1, 0, 0, 1), 2, 2)
n_points <- 10000
df <- mvrnorm(n = n_points, mu = rep(0, 2), Sigma = Sigma)
df <- as.data.frame(df)

The distribution of the values should look as follows:

[Figure: Distribution of the variables V1 and V2 we just generated; the two look fairly similar.]

Let's spice things up a bit and add some outliers to the mixture. In many fraud applications, the fraud rate is about 1–5%, so we generate 1% of our samples as coming from a normal...
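One plausible way to do this (a self-contained sketch; the shifted mean `c(5, 5)` and the seed are our own choices for illustration) is to draw 1% as many points from a normal distribution centered away from the bulk of the data and append them:

```r
library(MASS)  # for mvrnorm

set.seed(42)
n_points <- 10000
Sigma    <- diag(2)  # identity covariance, as before
df       <- as.data.frame(mvrnorm(n = n_points, mu = rep(0, 2), Sigma = Sigma))

# 1% "fraudulent" points drawn from a normal centered away from the origin
n_outliers <- round(0.01 * n_points)
outliers   <- as.data.frame(mvrnorm(n = n_outliers, mu = c(5, 5), Sigma = Sigma))
df_all     <- rbind(df, outliers)
nrow(df_all)  # 10100
```

Because the outliers come from a different generating process, a well-trained autoencoder will reconstruct them poorly, which is exactly the signal we exploit.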

Credit card fraud detection with autoencoders


Fraud is a multi-billion dollar problem, with credit card fraud probably the closest to our daily lives. Fraud begins with the theft of the physical credit card, or of data that could compromise the security of the account, such as the credit card number, expiration date, and security codes. A stolen card can be reported immediately if the victim knows it has been stolen. When the data is stolen, however, a compromised account can take weeks or even months to be exploited, and the victim only learns from their bank statement that the card has been used.

Traditionally, fraud detection systems rely on the creation of manually engineered features by subject matter experts, working either directly with financial institutions or with specialized software vendors. 

One of the biggest challenges in fraud detection is the availability of labelled datasets, which are often hard or even impossible to come by.
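This scarcity of labels is precisely why the unsupervised autoencoder approach is attractive: train on (mostly) normal transactions and flag whatever reconstructs badly. The recipe can be sketched schematically in base R; here the "reconstruction" is a stand-in (column means rather than a trained network), and the 95% threshold is an arbitrary choice for illustration:

```r
# Schematic outlier flagging via reconstruction error.
# The "reconstruction" here is a stand-in (column means), not a trained model.
set.seed(1)
X <- rbind(matrix(rnorm(200), ncol = 2),          # 100 normal points
           matrix(rnorm(10, mean = 6), ncol = 2)) # 5 anomalous points

X_hat <- matrix(colMeans(X), nrow(X), 2, byrow = TRUE)
mse   <- rowMeans((X - X_hat)^2)                  # per-row reconstruction error

threshold <- quantile(mse, 0.95)                  # flag the worst 5%
flagged   <- which(mse > threshold)
```

With a trained autoencoder, `X_hat` would be `predict(model, X)`; everything downstream of that line stays the same.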

Our first fraud example comes...

Variational Autoencoders


Variational Autoencoders (VAEs) are a more recent take on the autoencoding problem. Instead of learning an essentially arbitrary compression function, as we previously did with our neural networks, a VAE learns the random process that generates the data.

VAEs also have an encoder and a decoder part. The encoder learns the mean and standard deviation of a normal distribution that is assumed to have generated the data. The mean and standard deviation are called latent variables because they are not observed explicitly, but rather inferred from the data.

The decoder part of a VAE maps these latent-space points back into the data space. As before, we need a loss function to measure the difference between the original inputs and their reconstructions. Sometimes an extra term is added, called the Kullback-Leibler divergence, or simply KL divergence. The KL divergence computes, roughly, how much a probability...
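For the case relevant to VAEs, where both distributions are univariate normals, the KL divergence has a well-known closed form, which we can sketch directly in base R (the function name is ours):

```r
# KL divergence KL( N(mu1, sd1) || N(mu2, sd2) ) between two univariate
# normals, in closed form:
#   log(sd2/sd1) + (sd1^2 + (mu1 - mu2)^2) / (2 * sd2^2) - 1/2
kl_normal <- function(mu1, sd1, mu2, sd2) {
  log(sd2 / sd1) + (sd1^2 + (mu1 - mu2)^2) / (2 * sd2^2) - 0.5
}

kl_normal(0, 1, 0, 1)  # 0: identical distributions lose no information
kl_normal(1, 1, 0, 1)  # 0.5: grows as the distributions drift apart
```

In VAE training, this term penalizes the encoder's latent distribution for straying too far from a standard normal prior.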

Text fraud detection


Fraud has become an issue beyond traditional transaction fraud. Many websites, for instance, rely on user reviews of services, such as restaurants, hotels, or tourist attractions, that are monetized in different ways. If users lose trust in those reviews, for example because a business owner deliberately tampers with the reviews of his or her own business, then the website will find it hard to regain that trust and to remain profitable. Hence, it is important to detect such potential issues.

How can autoencoders help us with this? As before, the idea is to learn the representation of a normal review on a website, and then find the reviews that do not fit it. The issue with text data is that some processing must be done first. We will illustrate this with an example, which will also serve as motivation for the different ways of modelling text discussed in the next chapters.

From unstructured text data to a matrix

An issue with...

Summary


In this chapter, we learned that autoencoders are a technique used mainly in image reconstruction and denoising to obtain compressed, summarized representations of the data. We also saw that they are sometimes used for fraud detection tasks. Outlier identification comes from measuring the reconstruction error: by observing its distribution, we can set thresholds for identifying outliers. Variational autoencoders go further and learn the probabilistic process that generates the data; hence, they are also able to generate new data.
