You're reading from  Recurrent Neural Networks with Python Quick Start Guide

Product type: Book
Published in: Nov 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789132335
Edition: 1st Edition
Author (1)
Simeon Kostadinov

Simeon Kostadinov works for a startup called Speechify, which aims to help people get through their reading faster by converting any text into speech. Simeon is a machine learning enthusiast who writes a blog and works on various side projects. He enjoys reading research papers and implementing some of them in code. He was ranked number 1 in mathematics during his senior year of high school, and he has a deep passion for understanding how deep learning models work under the hood. His specific knowledge of recurrent neural networks comes from several courses that he has taken at Stanford University and the University of Birmingham. They helped him understand how to put his theoretical knowledge into practice and build powerful models. In addition, he recently joined the Stanford Scholar Initiative, which involves working in a team of machine learning researchers on a specific deep learning research paper.

Building Your Personal Assistant

In this chapter, we will focus our full attention on the practical side of recurrent neural networks when building a conversational chatbot. Using your most recent knowledge on sequence models, you will create an end-to-end model that aims to yield meaningful results. You will make use of a high-level TensorFlow-based library, called TensorLayer. This library makes it easier to create simple prototypes of complicated systems such as that of a chatbot. The main topics that will be covered are the following:

  • What are we building?: This is a more detailed introduction to the exact problem and its solution
  • Preparing the data: As always, any deep learning model requires this step, so it is crucial to mention it here
  • Creating the chatbot network: You will learn how to use TensorLayer to build the graph for the sequence-to-sequence model used for...

What are we building?

The focus of this chapter is to walk you through building a simple conversational chatbot that is able to give answers to a set of different questions. Recently, chatbots have become more and more popular, and we can see them in numerous practical applications.

Some areas where you can see the use of this software include the following:

  • Communication between clients and businesses, where the chatbot assists users in finding what they need, or provides support if something does not work properly. For example, Facebook offers a really handy way of implementing a chatbot for your business
  • The personal assistant behind voice control systems such as Amazon Alexa, Apple Siri, and more: You have a full end-to-end human-like conversation where you can set reminders, order products, and more

Our simple example will present a slightly augmented version of the...

Preparing the data

In this section, we will focus on how our data (tweets, in this case) is transformed to fit the model's requirements. We will first see how, using the files in the data/ folder from the GitHub repo for this task, the model can help us extract the needed tweets. Then, we will look at how, with the help of a simple set of functions, we can split and transform the data to achieve the needed results. 

An important file to examine is data.py, inside the data/twitter folder. It transforms plain text into a numeric format so it is easy for us to train the network. We won't go deep into the implementation, since you can examine it by yourself. After running the code, we produce three important files:

  • idx_q.npy: This is an array of arrays containing index representation of all the words in different sentences forming the chatbot questions...
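
To make the idea of an "index representation" concrete, here is a minimal sketch of how plain-text sentences can be turned into fixed-length arrays of word ids. It is not the code from data.py: the helper names build_vocab and encode, the special tokens, and the sample sentences are assumptions made for illustration, and the real script may filter, tokenize, and pad differently.

```python
from collections import Counter

# Hypothetical special tokens (assumed for this sketch): padding and unknown word.
PAD, UNK = "_", "unk"


def build_vocab(sentences, max_size=8000):
    """Map the most frequent words to integer ids; ids 0 and 1 are reserved."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    idx2w = [PAD, UNK] + [w for w, _ in counts.most_common(max_size)]
    w2idx = {w: i for i, w in enumerate(idx2w)}
    return w2idx, idx2w


def encode(sentence, w2idx, max_len):
    """Turn one sentence into a fixed-length list of word ids, padded with 0."""
    ids = [w2idx.get(w, w2idx[UNK]) for w in sentence.lower().split()][:max_len]
    return ids + [w2idx[PAD]] * (max_len - len(ids))


# Illustrative input only; the chapter's data consists of tweets.
questions = ["how are you", "what are you doing"]
w2idx, idx2w = build_vocab(questions)
idx_q = [encode(q, w2idx, max_len=5) for q in questions]
```

Each row of idx_q has the same length, so the rows can be stacked into a single array and fed to the network in batches. Words missing from the vocabulary fall back to the unknown-word id instead of raising an error.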

Creating the chatbot network

This section is one of the most important, so you need to make sure you understand it quite well in order to grasp the full concept of our application. We will be introducing the network graph that will be used for training and prediction. 

But first, let's define the hyperparameters of the model. These are predefined constants that play a significant role in determining how well the model performs. As you will learn in the next chapter, our main task is to tweak the hyperparameters' values until we're satisfied with the model's prediction. In this case, an initial set of hyperparameters is selected. Of course, for better performance, one needs to do some optimization on them. This chapter won't focus on this part but I highly recommend doing it using techniques from the last chapter of this book (Chapter 6, Improving...

Training the chatbot

Once we have defined the model graph, we want to train it using our input data. Then, we will have a well-tuned set of parameters that can be used for accurate predictions. 

First, we specify TensorFlow's Session object, which encapsulates the environment in which Operation objects (summation, subtraction, and so on) are executed and Tensor objects (placeholders, variables, and so on) are evaluated:

sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=False))
sess.run(tf.global_variables_initializer())

A good explanation of the config parameter can be found at https://stackoverflow.com/questions/44873273/what-do-the-options-in-configproto-like-allow-soft-placement-and-log-device-plac. In summary, once we specify allow_soft_placement, the operations will be executed on the CPU only if there is no GPU registered...

Building a conversation

This step is really similar to the training one. The first difference is that we don't make any evaluation of our predictions, but instead use the input to generate the results. The second difference is that we use the already trained set of variables to yield this result. You will see how it is done later in this chapter. 

To make things clearer, we first initialize a new sequence-to-sequence model. Its purpose is to use the already trained weights and biases and make predictions based on different sets of inputs. We only have an encoder and decoder sequence, where the encoder one is an input sentence and the decoder sequence is fed one word at a time. We define the new model as follows:

encode_seqs2 = tf.placeholder(dtype=tf.int64, shape=[1, None], name="encode_seqs")
decode_seqs2 = tf.placeholder(dtype=tf.int64, shape=[1, None], name...
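
The idea of feeding the decoder one word at a time can be sketched independently of TensorFlow. The loop below is a minimal illustration of greedy generation, not the chapter's implementation: generate_reply, toy_predict_next, START_ID, and END_ID are hypothetical names, and toy_predict_next is only a stand-in for the trained network, which would return the most likely next word together with the updated decoder state.

```python
START_ID, END_ID = 1, 2  # assumed ids for the start and end markers


def generate_reply(question_ids, predict_next, start_id, end_id, max_len=20):
    """Greedy decoding: feed the decoder one word at a time until end_id."""
    state = None          # no decoder state yet; the first call starts from the question
    word_id = start_id    # the decoder's first input is the start marker
    reply = []
    for _ in range(max_len):
        word_id, state = predict_next(question_ids, word_id, state)
        if word_id == end_id:
            break
        reply.append(word_id)
    return reply


def toy_predict_next(question_ids, prev_word_id, state):
    """Hypothetical stand-in for the trained model: replays a canned answer."""
    canned = [7, 8, 9, END_ID]
    step = 0 if state is None else state
    return canned[step], step + 1


reply_ids = generate_reply([4, 5, 6], toy_predict_next, START_ID, END_ID)
```

The max_len cap matters in practice: a model that never emits the end marker would otherwise generate words forever.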

Summary

This chapter reveals a full implementation of a chatbot system that manages to construct a short conversation. The prototype shows, in detail, each stage of building the intelligent chatbot. This includes collecting data, training the network, and making predictions (generating conversation). 

For the network's architecture, we use the powerful encoder-decoder sequence-to-sequence model, which uses two recurrent neural networks connected by an encoder vector. For the actual implementation, we make use of a deep learning library built on top of TensorFlow, called TensorLayer. It simplifies most of the work by introducing simple one-line implementations of standard models such as sequence-to-sequence. In addition, this library is useful for preprocessing your data before using it for training.

The next chapter shifts focus to, probably, the...

