You're reading from Hands-On Deep Learning with TensorFlow, 1st Edition, by Dan Van Boxel (Packt, July 2017, ISBN-13: 9781787282773).
Chapter 2. Deep Neural Networks

In the previous chapter, we looked at simple TensorFlow operations and how to use logistic regression on our font classification problem. In this chapter, we will dive into one of the most popular and successful machine learning approaches: neural networks. Using TensorFlow, we'll build both simple and deep neural networks to improve our model of the font classification problem. We'll put the basics of neural networks into practice, build and train our first neural network with TensorFlow, and then move on to a network with a hidden layer of neurons and examine it in detail. When completed, you will have a better grasp of the following topics:

  • Basic neural networks

  • The single hidden layer model

  • The single hidden layer explained

  • The multiple hidden layer model

  • Results of the multiple hidden layer model

In our first section, we'll review the basics of neural networks. You will learn common ways to transform input data, understand how neural...

Basic neural networks


Our logistic regression model worked well enough, but it was fundamentally linear in nature. Doubling the intensity of a pixel doubled its contribution to the score, but we might only really care whether a pixel is above a certain threshold, or we might want to put more weight on changes to small values. Linearity may not capture all the nuances of the problem. One way to handle this issue is to transform our input with a nonlinear function. Let's look at a simple example in TensorFlow.

First, be sure to load the required modules (tensorflow, numpy, and math) and start an interactive session:

import tensorflow as tf
import numpy as np
import math

sess = tf.InteractiveSession()

In the following example, we create three length-5 vectors of truncated normal random numbers (truncation keeps the draws from being too extreme), each with a different center:

x1 = tf.Variable(tf.truncated_normal([5],
                 mean=3, stddev=1./math.sqrt(5)))
x2 = tf.Variable(tf.truncated_normal([5],
                 mean=-1, stddev=1./math.sqrt(5)))
# The third vector's center is truncated in this excerpt;
# mean=1 is an assumed placeholder to keep the example runnable
x3 = tf.Variable(tf.truncated_normal([5],
                 mean=1, stddev=1./math.sqrt(5)))
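
To see what a nonlinear transform actually does, here's a minimal continuation (a sketch, assuming the TensorFlow 1.x API used throughout this book): initialize the variables and pass each vector through tf.sigmoid, which squashes values into the (0, 1) range:

sess.run(tf.global_variables_initializer())

# Values centered near 3 land close to 1; values centered near -1
# land below 0.5. Doubling an already-large input barely changes
# the output, which is exactly the thresholding behavior we wanted.
print(sess.run(tf.sigmoid(x1)))
print(sess.run(tf.sigmoid(x2)))
print(sess.run(tf.sigmoid(x3)))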

Single hidden layer model


Here, we'll put the basics of neural networks into practice. We'll adapt the logistic regression TensorFlow code into a single hidden layer of neurons. Then, you'll learn the idea behind backpropagation to compute the weights, that is, to train the net. Finally, you'll train your first true neural network in TensorFlow.

The TensorFlow code for this section should look familiar. It's just a slightly evolved version of the logistic regression code. Let's look at how to add a hidden layer of neurons that will compute nonlinear combinations of our input pixels.

You should start with a fresh Python session and execute the code that reads in and sets up the data, as in the logistic model. It's the same code, just copied to a new file:

import tensorflow as tf
import numpy as np
import math

# IPython magic to toggle auto-indentation; omit if running as a plain script
%autoindent

# Use tqdm progress bars if the package is available; otherwise
# fall back to a no-op wrapper so the training loops still run
try:
    from tqdm import tqdm
except ImportError:
    def tqdm(x, *args, **kwargs):
        return x

You can always go back to the previous sections and remind...
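
For orientation, here is a minimal sketch of where this section's code ends up (the 36x36 = 1296 input pixels, 5 font classes, 128 hidden neurons, and 0.01 learning rate are assumptions for illustration; match them to your own data):

num_hidden = 128

# Placeholders for flattened images and one-hot labels
x = tf.placeholder("float", [None, 1296])
y_ = tf.placeholder("float", [None, 5])

# Hidden layer: a nonlinear combination of the input pixels
W1 = tf.Variable(tf.truncated_normal([1296, num_hidden],
                 stddev=1./math.sqrt(1296)))
b1 = tf.Variable(tf.constant(0.1, shape=[num_hidden]))
h1 = tf.sigmoid(tf.matmul(x, W1) + b1)

# Output layer: logistic regression on the hidden activations
W2 = tf.Variable(tf.truncated_normal([num_hidden, 5],
                 stddev=1./math.sqrt(num_hidden)))
b2 = tf.Variable(tf.constant(0.1, shape=[5]))
logits = tf.matmul(h1, W2) + b2
y = tf.nn.softmax(logits)

# Cross-entropy loss plus gradient descent; TensorFlow handles
# the backpropagation through both layers for us
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)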

Single hidden layer explained


In this section, we'll carefully look at the model we built. First, we'll verify the overall accuracy of our model; then, we'll see where the model goes wrong. Finally, we'll visualize the weights associated with several neurons to see what they're looking for:

import matplotlib.pyplot as plt

plt.figure(figsize=(6, 6))
plt.plot(train_acc, 'bo')
plt.plot(test_acc, 'rx')

Make sure that you've trained your model as we did in the previous section; if not, you might want to stop here and do that first. Because we evaluated our model accuracy every 10 training epochs and saved the result, it's now easy to explore how our model has evolved.

Using Matplotlib, we can plot both the training accuracy (the blue dots) and the testing accuracy (the red x marks) on the same figure:

Again, if you don't have Matplotlib, that's okay; you can just look at the array values themselves. Note that the training accuracy (blue) is usually a little better than the testing accuracy (red). This isn't surprising,...
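
To inspect what individual neurons respond to, a minimal sketch (assuming the W1 weight variable from the model sketch above and 36x36 input images; both are assumptions to adapt to your own code) reshapes one neuron's incoming weights back into image form:

# Fetch the trained hidden layer weights out of the session
W1_vals = sess.run(W1)

# Reshape one neuron's column of weights into the image shape
# (36x36 is an assumed size; use your own dimensions)
neuron = 0
plt.matshow(W1_vals[:, neuron].reshape(36, 36))
plt.colorbar()
plt.show()

Bright and dark regions in the resulting image show which pixels push that neuron's activation up or down.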

The multiple hidden layer model


In this section, we'll show you how to build even more complex models with additional hidden layers. We'll adapt our single hidden layer model into a multilayer model known as a deep neural network. Then, we'll discuss how to choose the number of neurons and layers to use. Finally, we'll train the model itself; be patient, as this might take a while to compute.

Remember when we added a hidden layer of neurons to our logistic regression model? Well, we can do that again, adding another layer to our single hidden layer model. Once you have more than one layer of neurons, we call this a deep neural network. However, everything you learned before can be applied now. As in the previous sections of this chapter, you should make a fresh Python session and execute the code up to num_hidden1 in this section's code file. Then the fun starts.

Exploring the multiple hidden layer model

Let's start by changing the old num_hidden to num_hidden1 to indicate the number of neurons on...
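
As a sketch of where this is heading (the layer sizes here are illustrative placeholders, not the book's exact values; x, y_, and the input dimensions follow the assumptions from the single hidden layer sketch earlier), renaming num_hidden to num_hidden1 and stacking a second layer might look like this:

num_hidden1 = 128
num_hidden2 = 32  # placeholder size for the second layer

# First hidden layer, renamed from num_hidden to num_hidden1
W1 = tf.Variable(tf.truncated_normal([1296, num_hidden1],
                 stddev=1./math.sqrt(1296)))
b1 = tf.Variable(tf.constant(0.1, shape=[num_hidden1]))
h1 = tf.sigmoid(tf.matmul(x, W1) + b1)

# Second hidden layer, fed by the first; stacking layers like
# this is what makes the network "deep"
W2 = tf.Variable(tf.truncated_normal([num_hidden1, num_hidden2],
                 stddev=1./math.sqrt(num_hidden1)))
b2 = tf.Variable(tf.constant(0.1, shape=[num_hidden2]))
h2 = tf.sigmoid(tf.matmul(h1, W2) + b2)

# The output layer now reads from the second hidden layer
W3 = tf.Variable(tf.truncated_normal([num_hidden2, 5],
                 stddev=1./math.sqrt(num_hidden2)))
b3 = tf.Variable(tf.constant(0.1, shape=[5]))
logits = tf.matmul(h2, W3) + b3
y = tf.nn.softmax(logits)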

Results of the multiple hidden layer model


Now, we'll look into what's going on inside a deep neural network. First, we'll verify the model accuracy. Then, we'll visualize and study the pixel weights. Finally, we'll look at the output weights as well.

After you've trained your deep neural network, let's take a look at the model accuracy. We'll do this the same way we did for the single hidden layer model. The only difference this time is that we have many more saved samples of the training and testing accuracy, since we trained for many more epochs.

As always, don't worry if you don't have Matplotlib; printing parts of the arrays is fine.
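
For example, printing the last few saved values works just as well (a minimal sketch, assuming the same train_acc and test_acc arrays as before):

# The most recent accuracy measurements tell the same story
# as the tail end of the plot
print(train_acc[-5:])
print(test_acc[-5:])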

Understanding the multiple hidden layers graph

Execute the following code to see the result:

# Plot the accuracy curves
plt.figure(figsize=(6,6))
plt.plot(train_acc,'bo')
plt.plot(test_acc,'rx')

From the preceding output graph, we reach about 68 percent training accuracy and maybe 63 percent validation accuracy. This isn't too bad, but it does leave room for some...

Summary


In this chapter, we embraced deep learning with TensorFlow. Though we started with a simple model of one hidden layer of neurons, it didn't take you long to develop and train a deep neural network for the font classification problem.

You learned about the single and multiple hidden layer models and understood them in detail. You also explored different types of neural networks, and built and trained your first neural network with TensorFlow.

In the next chapter, we'll improve our model with convolutional neural networks, a powerful tool for image classification.
