Recurrent Neural Networks Using TensorFlow 2

One of the main drawbacks of a number of neural network architectures, including ConvNets (CNNs), is that they cannot process sequential data. In other words, a complete feature, for example, an image, has to be presented all at once: the input is a fixed-length tensor, and the output is a fixed-length tensor. Nor do the output values of previous features affect the current feature in any way; all of the input values (and output values) are taken to be independent of one another. For example, in our fashion_mnist model (Chapter 4, Supervised Machine Learning Using TensorFlow 2), each input fashion image is independent of, and totally ignorant of, previous images.

Recurrent Neural Networks (RNNs) overcome this problem and make a wide range of new applications possible.

In this chapter, we will...

Neural network processing modes

The following diagram illustrates the variety of neural network processing modes:

Rectangles represent tensors, arrows represent functions, red is input, blue is output, and green is the tensor state.

From left to right, we have the following (two of these modes are sketched in code after the list):

  • Plain feed-forward network, fixed-size input, and fixed-size output, for example, image classification
  • Sequence output, for example, image captioning that takes one image and outputs a set of words identifying items in the image
  • Sequence input, for example, sentiment identification (like our IMDb application) where a sentence is classed as being of positive or negative sentiment
  • Both sequence input and output, for example, machine translation where an RNN takes an English sentence and translates it into a French output
  • Synced sequence both input and output, for example, video classification that is like...
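Two of these modes can be made concrete with a short Keras sketch. This is a minimal illustration, not code from this chapter; the layer sizes and vocabulary size are arbitrary:

import tensorflow as tf

# Sequence input, single output (many-to-one), as in sentiment classification:
# the GRU returns only its final state, which a dense layer maps to one score.
many_to_one = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10000, output_dim=64),
    tf.keras.layers.GRU(32),  # return_sequences=False by default
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Synced sequence input and output (many-to-many), as in per-frame labeling:
# return_sequences=True emits one output vector per time step.
many_to_many = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10000, output_dim=64),
    tf.keras.layers.GRU(32, return_sequences=True),
    tf.keras.layers.Dense(5, activation='softmax')  # applied at every time step
])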

Recurrent architectures

Hence, a new architecture is required for handling data that arrives sequentially, and where its input values, its output values, or both are of variable length, for example, the words in a sentence in a language translation application. In this case, both the input to and the output from the model are of varying lengths, as in the fourth mode previously. Also, in order to predict subsequent words given the current word, previous words need to be known as well. This new neural network architecture is called an RNN, and it is specifically designed to handle sequential data.

The term recurrent arises because such models perform the same computation on every element of a sequence, where each output is dependent on previous output. Theoretically, each output depends on all of the previous output items, but in practical terms, RNNs are limited to looking back just...
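To make the recurrence concrete, here is a minimal NumPy sketch of the update inside a vanilla RNN cell. This is an illustration only; the model in this chapter uses a GRU, which adds gating to this basic scheme. The same weights are applied at every time step, and each new state depends on the previous one:

import numpy as np

np.random.seed(0)
input_dim, hidden_dim, steps = 8, 16, 5

# The same weights are reused at every time step.
W_x = np.random.randn(hidden_dim, input_dim) * 0.1   # input-to-hidden weights
W_h = np.random.randn(hidden_dim, hidden_dim) * 0.1  # hidden-to-hidden weights
b = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)  # initial state
for t in range(steps):
    x_t = np.random.randn(input_dim)      # stand-in for the t-th sequence element
    h = np.tanh(W_x @ x_t + W_h @ h + b)  # new state depends on the previous state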

An application of RNNs

In this application, we will see how to create text using a character-based recurrent neural network. It is easy to change the corpus of text to be used (see the example to follow); here, we will use the novel Great Expectations by Charles Dickens. We will train the network on this text so that, if we give it a character sequence such as thousan, it will produce the next character in the sequence, d. This process can be continued, and longer sequences of text can be created by calling the model repeatedly on the evolving sequence.

Here is an example of the text created before the model is trained:

Input: 
 'o else is there to inform?”\n\n“Is there no chance person who might identify you in the street?” said\n'
Next Char Predictions: 
 "dUFdZ!mig())'(ZIon“4g&HZ”@\nWGWtlinnqQY*dGJ7ioU'6(vLKL&...

The code for our RNN example

This application is based on one provided by Google under an Apache 2 license.

As usual, we will break the code down into snippets and refer you to the repository for the license and the full working version. Firstly, we have module imports, as follows:

import tensorflow as tf
import numpy as np
import os
import time

Next, we have the download link for the text file.

You can easily change this to any text you wish by specifying the file name in file and the full URL of the file in url:

file='1400-0.txt'
url='https://www.gutenberg.org/files/1400/1400-0.txt' # Great Expectations by Charles Dickens

And then we set up the Keras get_file() utility for that file, shown as follows:

path = tf.keras.utils.get_file(file,url)

Then, we open and read the file and see how long it is, in characters:

text = open(path, encoding='utf-8').read() # the Gutenberg -0.txt file is UTF-8 encoded
print('Length of text: {} characters'.format(len(text)))
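Before the text can be fed to an embedding layer, each character must be mapped to an integer index. A plausible sketch of that step follows; the names vocabulary, char_to_index, and text_as_int are illustrative, not taken from the repository:

vocabulary = sorted(set(text))        # the unique characters in the corpus
vocabulary_length = len(vocabulary)
char_to_index = {char: index for index, char in enumerate(vocabulary)}
index_to_char = np.array(vocabulary)  # inverse mapping, index back to character
text_as_int = np.array([char_to_index[char] for char in text])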

Building and instantiating our model

As we have seen previously, one technique for building a model is to pass the required layers into the tf.keras.Sequential() constructor. In this instance, we have three layers: an embedding layer, an RNN layer, and a dense layer.

The first layer, the embedding, is a lookup table of vectors, one vector for the numeric value of each character; each vector has dimension embedding_dimension. The middle, recurrent layer is a GRU of size recurrent_nn_units. The last layer is a dense output layer with vocabulary_length units.

What the model does is look up the embedding, run the GRU for a single time step using the embedding as input, and pass the result to the dense layer, which generates logits (unnormalized log probabilities) for the next character.

A diagram showing this is as follows:

The code that implements this model is, therefore, as follows:

def build_model...
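The repository holds the full definition; as a sketch, a build_model function consistent with the three layers just described might look like the following. The batch_size parameter and the stateful and initializer arguments are assumptions for illustration, not necessarily the book's exact code:

def build_model(vocabulary_length, embedding_dimension, recurrent_nn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocabulary_length, embedding_dimension,
                                  batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(recurrent_nn_units,
                            return_sequences=True,  # one output vector per time step
                            stateful=True,          # carry state between batches
                            recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocabulary_length)    # logits for the next character
    ])
    return model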

Using our model to get predictions

To get the predictions from our model, we need to take a sample from the output distribution. This sampling will get us the characters we need from that output distribution (sampling the output distribution is important because taking the argmax of it, as we would normally do, can easily get the model stuck in a loop).

tf.random.categorical does this sampling and tf.squeeze with axis=-1 removes the last dimension of the tensor, prior to displaying the indices.

The signature of tf.random.categorical is as follows:

tf.random.categorical(logits, num_samples, dtype=None, seed=None, name=None)

Comparing this with the call, we see that we are taking one sample (of length sequence_length = 100) from the predictions (example_batch_predictions[0]). The extra dimension is then removed, so we can look up the characters corresponding to the sample...
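As a sketch of that call, using the names from the text (example_batch_predictions comes from running the untrained model on one batch):

sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()  # drop the extra dimension
# sampled_indices now holds sequence_length integer indices, one per time step,
# which can be looked up (for example, in index_to_char) to recover characters.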

Summary

This concludes our look at RNNs. In this chapter, we first discussed the general principles of RNNs, and then saw how to acquire and prepare some text for use by a model, noting that it is straightforward to use an alternative source of text here. We then saw how to create and instantiate our model, trained it, and used it to produce text from our starting string. We noted that the network has learned that words are units of text, and how to spell quite a variety of words, somewhat in the style of the author of the text, with only a couple of non-words.

In the next chapter, we will look at the use of TensorFlow Hub, which is a library of reusable machine learning modules.
