Home Data Hands-On Deep Learning with TensorFlow

Hands-On Deep Learning with TensorFlow

By Dan Van Boxel
books-svg-icon Book
eBook $29.99 $20.98
Print $38.99
Subscription $15.99 $10 p/m for three months
$10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime!
What do you get with a Packt Subscription?
This book & 7000+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook + Subscription?
Download this book in EPUB and PDF formats, plus a monthly download credit
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook?
Download this book in EPUB and PDF formats
Access this title in our online reader
DRM FREE - Read whenever, wherever and however you want
Online reader with customised display settings for better reading experience
What do you get with video?
Download this video in MP4 format
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with video?
Stream this video
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with Audiobook?
Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF
What do you get with Exam Trainer?
Flashcards, Mock exams, Exam Tips, Practice Questions
Access these resources with our interactive certification platform
Mobile compatible-Practice whenever, wherever, however you want
BUY NOW $10 p/m for first 3 months. $15.99 p/m after that. Cancel Anytime!
eBook $29.99 $20.98
Print $38.99
Subscription $15.99 $10 p/m for three months
What do you get with a Packt Subscription?
This book & 7000+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook + Subscription?
Download this book in EPUB and PDF formats, plus a monthly download credit
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with a Packt Subscription?
This book & 6500+ ebooks & video courses on 1000+ technologies
60+ curated reading lists for various learning paths
50+ new titles added every month on new and emerging tech
Early Access to eBooks as they are being written
Personalised content suggestions
Customised display settings for better reading experience
50+ new titles added every month on new and emerging tech
Playlists, Notes and Bookmarks to easily manage your learning
Mobile App with offline access
What do you get with eBook?
Download this book in EPUB and PDF formats
Access this title in our online reader
DRM FREE - Read whenever, wherever and however you want
Online reader with customised display settings for better reading experience
What do you get with video?
Download this video in MP4 format
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with video?
Stream this video
Access this title in our online reader
DRM FREE - Watch whenever, wherever and however you want
Online reader with customised display settings for better learning experience
What do you get with Audiobook?
Download a zip folder consisting of audio files (in MP3 Format) along with supplementary PDF
What do you get with Exam Trainer?
Flashcards, Mock exams, Exam Tips, Practice Questions
Access these resources with our interactive certification platform
Mobile compatible-Practice whenever, wherever, however you want
About this book
Dan Van Boxel’s Deep Learning with TensorFlow is based on Dan’s best-selling TensorFlow video course. With deep learning going mainstream, making sense of data and getting accurate results using deep networks is possible. Dan Van Boxel will be your guide to exploring the possibilities with deep learning; he will enable you to understand data like never before. With the efficiency and simplicity of TensorFlow, you will be able to process your data and gain insights that will change how you look at data. With Dan’s guidance, you will dig deeper into the hidden layers of abstraction using raw data. Dan then shows you various complex algorithms for deep learning and various examples that use these deep neural networks. You will also learn how to train your machine to craft new features to make sense of deeper layers of data. In this book, Dan shares his knowledge across topics such as logistic regression, convolutional neural networks, recurrent neural networks, training deep networks, and high level interfaces. With the help of novel practical examples, you will become an ace at advanced multilayer networks, image recognition, and beyond.
Publication date:
July 2017
Publisher
Packt
Pages
174
ISBN
9781787282773

 

Chapter 1. Getting Started

TensorFlow is a new machine learning and graph computation library recently released by Google. Its Python interface ensures the elegant design of common models, while its compiled backend ensures speed.

Let's take a glimpse at the techniques you'll learn and the models you'll build as you apply TensorFlow.

 

Installing TensorFlow


In this section, you will learn what TensorFlow is, how to install it, and how to build simple models and do simple computations. Further, you will learn how to build a logistic regression model for classification, and introduce a machine learning problem to help us learn TensorFlow.

We're going to learn what kind of library TensorFlow is and install it on our own Linux machine, or a free instance of CoCalc if you don't have access to a Linux machine.

TensorFlow – main page

First, what is TensorFlow? TensorFlow is a new machine learning library put out by Google. It is designed to be very easy to use and is very fast. If you go to the TensorFlow website, tensorflow.org, you will have access to a wealth of information about what TensorFlow is and how to use it. We'll be referring to this often, particularly the documentation.

TensorFlow – the installation page

Before we get started with TensorFlow, note that you need to install it, as it probably doesn't come preinstalled on your operating system. So, if you go to the Install tab on the TensorFlow web page, click on Installing TensorFlow on Ubuntu, and then click on "native" pip, you will learn how to install TensorFlow.

Installing TensorFlow is very challenging, even for experienced system administrators. So, I highly recommend using something like the pip installation; alternatively, if you're familiar with Docker, use the Docker installation. You can install TensorFlow from the source, but this can be very difficult. We will install TensorFlow using a precompiled binary called a wheel file . You can install this file using Python's pip module installer.

Installing via pip

For the pip installation, you have the option of using either a Python 2 or Python 3 version. Also, you can choose between the CPU and GPU version. If your computer has a powerful graphics card, the GPU version may be for you.

However, you need to check that your graphics card is compatible with TensorFlow. If it's not, it's fine; everything in this series can be done with just the CPU version.

Note

We can install TensorFlow by using the pip install tensorflow command (based on your CPU or GPU support and pip version), as shown in the preceding screenshot.

So, if you copy the following line for TensorFlow, you can install it as well:

# Python 3.4 installation
sudo pip3 install --upgrade \
https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.2.1-cp34-cp34m-linux_x86_64.whl

If you don't have Python 3.4, as the wheel file called for, that's okay. You can probably still use the same wheel file. Let's take a look at how to do this for Python 3.5. First, you just need to download the wheel file directly, by either putting the following URL in your browser or using a command-line program, such as wget, as we're doing here:

wget  https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.2.1-cp34-cp34m-linux_x86_64.whl

If you download this, it will very quickly be grabbed by your computer.

Now all you need to do is change the name of the file from cp34, which stands for Python 3.4, to whichever version of Python 3 you're using. In this case, we'll change it to a version using Python 3.5, so we'll change 4 to 5:

mv tensorflow-1.2.1-cp34-cp34m-linux_x86_64.whl tensorflow-1.2.1-cp35-cp35m-linux_x86_64.whl

Now you can install TensorFlow for Python 3.5 by simply changing the installation line here to pip3 install and the name of the new wheel file after changing it to 3.5:

sudo pip3 install ./tensorflow-1.2.1-cp35-cp35m-linux_x86_64.whl

We can see this works just fine. Now you've installed TensorFlow.

If your installation somehow becomes corrupted later, you can always jump back to this segment to remind yourself about the steps involved in the installation.

Installing via CoCalc

If you don't have administrative or installation rights on your computer but still want to try TensorFlow, you can try running TensorFlow over the web in a CoCalc instance. If you go to https://cocalc.com/ and create a new account, you can create a new project. This will give you a sort of a virtual machine that you can play around with. Conveniently, TensorFlow is already installed in the Anaconda 3 kernel.

Let's create a new project called TensorFlow. Click on +Create new project…, enter a title for your project, and click on Create Project. Now we can go into our project by clicking on the title. It will take a couple of seconds to load.

Click on +New to create a new file. Here, we'll create a Jupyter notebook:

Jupyter is a convenient way to interact with IPython and the primary means of using CoCalc for these computations. It may take a few seconds to load.

When you get to the interface shown in the following screenshot, the first thing you need to do is change the kernel to Anaconda Python 3 by going to Kernel | Change kernel… | Python 3 (Anaconda):

This will give you the proper dependencies to use TensorFlow. It may take a few seconds for the kernel to change. Once you are connected to the new kernel, you can type import tensorflow in the cell and go to Cell | Run Cells to check whether it works:

If your Jupyter notebook takes a long time to load, you can instead create a Terminal in CoCalc using the button shown in the following screenshot:

Once there, type anaconda3 to switch environments, then type ipython3 to launch an interactive Python session, as shown in the following screenshot:

You can easily work here, although you won't be able to visualize the output. Type import tensorflow in the Terminal and off you go.

So far in this section, you've learned what TensorFlow is and how to install it, either locally or on a virtual machine on the web. Now we're ready to explore simple computations in TensorFlow.

 

Simple computations


First, we're going to take a look at the tensor object type. Then we'll have a graphical understanding of TensorFlow to define computations. Finally, we'll run the graphs with sessions, showing how to substitute intermediate values.

Defining scalars and tensors

The first thing you need to do is download the source code pack for this book and open the simple.py file. You can either use this file to copy and paste lines into TensorFlow or CoCalc, or type them directly yourselves. First, let's import tensorflow as tf. This is a convenient way to refer to it in Python. You'll want to hold your constant numbers in tf.constant calls. For example, let's do a = tf.constant(1) and b = tf.constant(2):

import tensorflow as tf
# You can create constants in TF to hold specific values
a = tf.constant(1)
b = tf.constant(2)

Of course, you can add and multiply these to get other values, namely c and d:

# Of course you can add, multiply, and compute on these as you like
c = a + b
d = a * b

TensorFlow numbers are stored in tensors, a fancy term for multidimensional arrays. If you pass a Python list to TensorFlow, it does the right thing and converts it into an appropriately dimensioned tensor. You can see this illustrated in the following code:

# TF numbers are stored in "tensors", a fancy term for multidimensional arrays. If you pass TF a Python list, it can convert it
V1 = tf.constant([1., 2.])   # Vector, 1-dimensional
V2 = tf.constant([3., 4.])   # Vector, 1-dimensional
M = tf.constant([[1., 2.]])             # Matrix, 2d
N = tf.constant([[1., 2.],[3.,4.]])     # Matrix, 2d
K = tf.constant([[[1., 2.],[3.,4.]]])   # Tensor, 3d+

The V1 vector, a one-dimensional tensor, is passed as a Python list of [1. , 2.]. The dots here just force Python to store the number as decimal values rather than integers. The V2 vector is another Python list of [3. , 4. ]. The M variable is a two-dimensional matrix made from a list of lists in Python, creating a two-dimensional tensor in TensorFlow. The N variable is also a two-dimensional matrix. Note that this one actually has multiple rows in it. Finally, K is a true tensor, containing three dimensions. Note that the final dimension contains just one entry, a single two-by-two box.

Don't worry if this terminology is a bit confusing. Whenever you see a strange new variable, you can jump back to this point to understand what it might be.

Computations on tensors

You can also do simple things, such as add tensors together:

V3 = V1 + V2

Alternatively, you can multiply them element-wise, so each common position is multiplied together:

# Operations are element-wise by default
M2 = M * M

For true matrix multiplication, however, you need to use tf.matmul, passing in your two tensors as arguments:

NN = tf.matmul(N,N)

Doing computation

Everything so far has just specified the TensorFlow graph; we haven't yet computed anything. To do this, we need to start a session in which the computations will take place. The following code creates a new session:

sess = tf.Session()

Once you have a session open, doing: sess.run(NN) will evaluate the given expression and return an array. We can easily send this to a variable by doing the following:

output = sess.run(NN)
print("NN is:")
print(output)

If you run this cell now, you should see the correct tensor array for the NN output on the screen:

When you're done using your session, it's good to close it, just like you would close a file handle:

# Remember to close your session when you're done using it
sess.close()

For interactive work, we can use tf.InteractiveSession() like so:

sess = tf.InteractiveSession()

You can then easily compute the value of any node. For example, entering the following code and running the cell will output the value of M2:

# Now we can compute any node
print("M2 is:")
print(M2.eval())

Variable tensors

Of course, not all our numbers are constant. To update weights in a neural network, for example, we need to use tf.Variable to create the appropriate object:

W = tf.Variable(0, name="weight")

Note that variables in TensorFlow are not initialized automatically. To do so, we need to use a special call, namely tf.global_variables_initializer(), and then run that call with sess.run():

init_op = tf.global_variables_initializer()
sess.run(init_op)

This is to put a value in that variable. In this case, it will stuff a 0 value into the W variable. Let's just verify that W has that value:

print("W is:")
print(W.eval())

You should see an output value for W of 0 in your cell:

Let's see what happens when you add a to it:

W += a
print("W after adding a:")
print(W.eval())

Recall that a is 1, so you get the expected value of 1 here:

Let's add a again, just to make sure we can increment and that it's truly a variable:

W += a
print("W after adding a:")
print(W.eval())

Now you should see that W is holding 2, as we have incremented it twice with a:

Viewing and substituting intermediate values

You can return or supply arbitrary nodes when doing a TensorFlow computation. Let's define a new node but also return another node at the same time in a fetch call. First, let's define our new node E, as shown here:

E = d + b # 1*2 + 2 = 4

Let's take a look at what E starts as:

print("E as defined:")
print(E.eval())

You should see that, as expected, E equals 4. Now let's see how we can pass in multiple nodes, E and d, to return multiple values from a sess.run call:

# Let's see what d was at the same time
print("E and d:")
print(sess.run([E,d]))

You should see multiple values, namely 4 and 2, returned in your output:

Now suppose we want to use a different intermediate value, say for debugging purposes. We can use feed_dict to supply a custom value to a node anywhere in our computation when returning a value. Let's do that now with d equals 4 instead of 2:

# Use a custom d by specifying a dictionary
print("E with custom d=4:")
print(sess.run(E, feed_dict = {d:4.}))

Remember that E equals d + b and the values of d and b are both 2. Although we've inserted a new value of 4 for d, you should see that the value of E will now be output as 6:

You have now learned how to do core computations with TensorFlow tensors. It's time to take the next step forward by building a logistic regression model.

 

Logistic regression model building


Okay, let's get started with building a real machine learning model. First, we'll see the proposed machine learning problem: font classification. Then, we'll review a simple algorithm for classification, called logistic regression. Finally, we'll implement logistic regression in TensorFlow.

Introducing the font classification dataset

Before we jump in, let's load all the necessary modules:

import tensorflow as tf
import numpy as np

If you're copying and pasting to IPython, make sure your autoindent property is set to OFF:

%autoindent

The tqdm module is optional; it just shows nice progress bars:

try:
    from tqdm import tqdm
except ImportError:
    def tqdm(x, *args, **kwargs):
        return x

Next, we'll set a seed of 0, just to get consistent data splitting from run to run:

# Set random seed
np.random.seed(0)

In this book, we've provided a dataset of the images of characters using five fonts. For convenience, these are stored in a compressed NumPy file (data_with_labels.npz), which can be found in the download package of this book. You can easily load these into Python with numpy.load:

# Load data
data = np.load('data_with_labels.npz')
train = data['arr_0']/255.
labels = data['arr_1']

The train variable here holds the actual pixel values scaled from 0 to 1, and labels holds the type of font that it was; therefore, it'll be either 0, 1, 2, 3, or 4, as there are five fonts in total. You can print out these values, so you can look at them using the following code:

# Look at some data
print(train[0])
print(labels[0])

However, that's not very instructive, as most of the values are zeroes and only the central part of the screen contains the image data:

If you have Matplotlib installed, now is a good place to import it. We'll use plt.ion() to automatically bring up figures when needed:

# If you have matplotlib installed
import matplotlib.pyplot as plt
plt.ion()

Here are some example images of characters from each font:

Yeah, they're pretty flashy. In the dataset, each image is represented as a 36 x 36 two-dimensional matrix of pixel darkness values. The 0 value represents a white pixel, while 255 represents a black pixel. Everything in between is a shade of gray. Here's the code to display these fonts on your own machine:

# Let's look at a subplot of one of A in each font
f, plts = plt.subplots(5, sharex=True)
c = 91
for i in range(5):
    plts[i].pcolor(train[c + i * 558],
                   cmap=plt.cm.gray_r)

If your plot appears really wide, you can easily resize the window just using your mouse. It's often much more work to resize it ahead of time in Python if you're simply plotting interactively. Our goal is to decide which font an image belongs to, given that we have many other labeled images of the fonts. To expand the dataset and help avoid overfitting, we have also jittered each character around in the 36 x 36 area, giving us nine times as many data points.

It may be helpful to come back to this after working with later models. It's important to keep the original data in mind, no matter how advanced the final model is.

Logistic regression

If you're familiar with linear regression, you're halfway toward understanding logistic regression. Basically, we're going to assign a weight to each pixel in the image, then take the weighted sum of those pixels (beta for weights and X for pixels). This will give us a score for that image being a particular font. Every font will have its own set of weights, as they will value pixels differently. To convert these scores into proper probabilities (represented by Y), we will use what's called the softmax function to force their sum to be between 0 and 1, as illustrated next. Whatever probability is the greatest for a particular image, we will classify it into the associated class.

You can read more about the theory of logistic regression in most statistical modeling textbooks. Here is its formula:

One good reference that focuses on applications is William H. Greene's Econometric Analysis, Pearson, published in the year 2012.

Getting data ready

Implementing logistic regression is pretty easy in TensorFlow and will serve as scaffolding for more complex machine learning algorithms. First, we need to convert our integer labels into a one-hot format. This means, instead of labeling an image with font class 2, we transform the label into [0, 0, 1, 0, 0]. That is, we stick 1 in position two (note 0-up counting is common in computer science) and 0 for every other class. Here's the code for our to_onehot function:

def to_onehot(labels,nclasses = 5):
    '''
    Convert labels to "one-hot" format.
    >>> a = [0,1,2,3]
    >>> to_onehot(a,5)
    array([[ 1.,  0.,  0.,  0.,  0.],
           [ 0.,  1.,  0.,  0.,  0.],
           [ 0.,  0.,  1.,  0.,  0.],
           [ 0.,  0.,  0.,  1.,  0.]])
    '''
    outlabels = np.zeros((len(labels),nclasses))
    for i,l in enumerate(labels):
        outlabels[i,l] = 1
    return outlabels

With this done, we can go ahead and call the function:

onehot = to_onehot(labels)

For the pixels, we don't really want a matrix in this case, so we'll flatten the 36 x 36 numbers into a one-dimensional vector of length 1,296, but this will come a little bit later. Also, recall that we've rescaled the pixel values of 0-255 so that they fall between 0 and 1.

Okay, our final piece of preparation is to split our dataset into training and validation sets. This will help us catch overfitting later on. The training set will help us determine the weights in our logistic regression model, and the validation set will just be used to confirm that those weights are reasonably correct on new data:

# Split data into training and validation
indices = np.random.permutation(train.shape[0])
valid_cnt = int(train.shape[0] * 0.1)
test_idx, training_idx = indices[:valid_cnt],\
                         indices[valid_cnt:]
test, train = train[test_idx,:],\
              train[training_idx,:]
onehot_test, onehot_train = onehot[test_idx,:],\
                        onehot[training_idx,:]

Building a TensorFlow model

Okay, let's kick off the TensorFlow code by creating an interactive session:

sess = tf.InteractiveSession()

With this, we've started our first model in TensorFlow.

We're going to use a placeholder variable for x, which represents our input images. This is just to tell TensorFlow that we will supply the value for this node via feed_dict later on:

# These will be inputs
## Input pixels, flattened
x = tf.placeholder("float", [None, 1296])

Also, note that we can specify the shape of this tensor, and here we have used None as one of the sizes. The None size allows us to send an arbitrary number of data points into the algorithm at once for batch processing. We'll use the variable y_ likewise to hold our known labels to be used for training later on:

## Known labels
y_ = tf.placeholder("float", [None,5])

To perform logistic regression, we need a set of weights (W). In fact, we need 1,296 weights for each of the five font classes, which will give us our shape. Note that we also want to include an extra weight for each class as a bias (b). This is the same as adding an extra input variable that always takes the value 1:

# Variables
W = tf.Variable(tf.zeros([1296,5]))
b = tf.Variable(tf.zeros([5]))

With all these TensorFlow variables floating around, we need to make sure they get initialized. Let's call them now:

# Just initialize
sess.run(tf.global_variables_initializer())

Good job! You've got everything prepared. Now you can implement the softmax formula to compute probabilities. Because we set up our weights and input very carefully, TensorFlow makes this task very easy with just a call to tf.matmul and tf.nn.softmax:

# Define model
y = tf.nn.softmax(tf.matmul(x,W) + b)

That's it! You've implemented an entire machine learning classifier in TensorFlow. Nice work. But where do we get the values for the weights? Let's take a look at using TensorFlow to train the model.

 

Logistic regression training


First, you'll learn about the loss function for our machine learning classifier and implement it in TensorFlow. Then, we'll quickly train the model by evaluating the right TensorFlow node. Finally, we'll verify that our model is reasonably accurate and the weights make sense.

Developing the loss function

Optimizing our model really means minimizing how wrong we are. With our labels in one-hot style, it's easy to compare these with the class probabilities predicted by the model. The categorical cross_entropy function is a formal way to measure this. While the exact statistics are beyond the scope of this course, you can think of it as punishing the model for more for less accurate predictions. To compute it, we multiply our one-hot real labels element-wise with the natural log of the predicted probabilities, then sum these values and negate them. Conveniently, TensorFlow already includes this function as tf.nn.softmax_cross_entropy_with_logits() and we can just call that:

# Climb on cross-entropy
cross_entropy = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(
        logits = y + 1e-50, labels = y_))

Note that we are adding a small error value of 1e-50 here to avoid numerical instability problems.

Training the model

TensorFlow is convenient in that it provides built-in optimizers to take advantage of the loss function we just wrote. Gradient descent is a common choice and will slowly nudge our weights toward better results. This is the node that will update our weights:

# How we train
train_step = tf.train.GradientDescentOptimizer(
                0.02).minimize(cross_entropy)

Before we actually start training, we should specify a few more nodes to assess how well the model does:

# Define accuracy
correct_prediction = tf.equal(tf.argmax(y,1),
                     tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(
           correct_prediction, "float"))

The correct_prediction node is 1 if our model assigns the highest probability to the correct class, and 0 otherwise. The accuracy variable averages these predictions over the available data, giving us an overall sense of how well the model did.

When training in machine learning, we often want to use the same data point multiple times to squeeze all the information out. Each pass through the entire training data is called an epoch. Here, we're going to save both the training and validation accuracy every 10 epochs:

# Actually train
epochs = 1000
train_acc = np.zeros(epochs//10)
test_acc = np.zeros(epochs//10)
for i in tqdm(range(epochs)):
    # Record summary data, and the accuracy
    if i % 10 == 0:
        # Check accuracy on train set
        A = accuracy.eval(feed_dict={
            x: train.reshape([-1,1296]),
            y_: onehot_train})
        train_acc[i//10] = A
        # And now the validation set
        A = accuracy.eval(feed_dict={
            x: test.reshape([-1,1296]),
            y_: onehot_test})
        test_acc[i//10] = A
    train_step.run(feed_dict={
        x: train.reshape([-1,1296]),
        y_: onehot_train})

Note that we use feed_dict to pass in different types of data to get different output values. Finally, train_step.run updates the model every iteration. This should only take a few minutes on a typical computer, much less if you're using a GPU, and a bit more on an underpowered machine.

You just trained a model with TensorFlow; awesome!

Evaluating the model accuracy

After 1,000 epochs, let's take a look at the model. If you have Matplotlib installed, you can view the accuracies in a graphical plot; if not, you can still look at the number. For the final results, use the following code:

# Notice that accuracy flattens out
print(train_acc[-1])
print(test_acc[-1])

If you do have Matplotlib installed, you can use the following code to display the plot:

# Plot the accuracy curves
plt.figure(figsize=(6,6))
plt.plot(train_acc,'bo')
plt.plot(test_acc,'rx')

You should see something like the following plot (note that we used some random initialization, so it might not be exactly the same):

It seems like the validation accuracy flattens out after about 400-500 iterations; beyond this, our model may either be overfitting or not learning much more. Also, even though the final accuracy of about 40 percent might seem poor, recall that, with five classes, a totally random guess would only have 20 percent accuracy. With this limited dataset, the simple model is doing all it can.

It's also often helpful to look at computed weights. These can give you a clue as to what the model thinks is important. Let's plot them by pixel position for a given class:

# Look at a subplot of the weights for each font
f, plts = plt.subplots(5, sharex=True)
for i in range(5):
    plts[i].pcolor(W.eval()[:,i].reshape([36,36]))

This should give you a result similar to the following (again, if the plot comes out very wide, you can squeeze in the window size to square it up):

We can see that the weights near the interior are important in some models, while the weights on the outside are essentially zero. This makes sense, since none of the font characters reach the corners of the images.

Again, note that your final results might look a little different due to random initialization effects. Always feel free to experiment and change the parameters of the model; that's how you'll learn new things.

Summary

In this chapter, we installed TensorFlow on a machine we can use. After some small steps with basic computations, we jumped into a machine learning problem, successfully building a decent model with just logistic regression and a few lines of TensorFlow code.

In the next chapter, we'll see TensorFlow in its prime with deep neural networks.

About the Author
  • Dan Van Boxel

    Dan Van Boxel is a data scientist and machine learning engineer with over 10 years of experience. He is most well-known for Dan Does Data, a YouTube livestream demonstrating the power and pitfalls of neural networks. He has developed and applied novel statistical models of machine learning to topics such as accounting for truck traffic on highways, travel time outlier detection, and other areas. Dan has also published research articles and presented findings at the Transportation Research Board and other academic journals.

    Browse publications by this author
Latest Reviews (1 reviews total)
The book wasn't as comprehensive as I would have expected, but it still covered enough ground to be valuable. For the price paid it was very good.
Hands-On Deep Learning with TensorFlow
Unlock this book and the full library FREE for 7 days
Start now