GRUs Compared to LSTMs, RNNs, and Feedforward networks

In this chapter, we're going to talk about gated recurrent units (GRUs) and compare them to the LSTMs we learned about in the previous chapter. As you know, LSTMs have been around since 1997 and are among the most widely used models in deep learning for NLP today. GRUs, first presented in 2014, are a simpler variant of LSTMs that shares many of the same properties but is easier and faster to train, typically with lower computational complexity.

In this chapter, we will learn about the following:

  • GRUs
  • How GRUs differ from LSTMs
  • How to implement a GRU
  • GRU, LSTM, RNN, and Feedforward comparisons
  • Network differences

Technical requirements

You will be required to have a basic knowledge of .NET development using Microsoft Visual Studio and C#. You will need to download the code for this chapter from the book website.

Check out the following video to see Code in Action: http://bit.ly/2OHd7o5.

QuickNN

To follow along with the code, you should have the QuickNN solution open inside Microsoft Visual Studio. We will use this code to explain some of the finer points of coding the different networks, as well as to compare them. Here is the solution you should have loaded:

[Figure: The QuickNN solution loaded in Visual Studio]

Understanding GRUs

GRUs are cousins of long short-term memory (LSTM) recurrent neural networks. Both LSTM and GRU networks have additional parameters that control when and how their internal memory is updated, and both can capture long- and short-term dependencies in sequences. GRU networks, however, have fewer parameters than their LSTM cousins and, as a result, are faster to train. The GRU learns how to use its reset and update gates to make longer-term predictions while protecting its memory. Let's look at a simple diagram of a GRU:

[Figure: A simple diagram of a GRU]
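
To make the gating concrete, here is a minimal sketch of a single GRU step in C#. This is not the QuickNN implementation: the method and parameter names are illustrative, and it operates on one scalar unit rather than vectors, but the arithmetic is the standard GRU update:

using System;

// A minimal, self-contained sketch of one GRU time step for a single unit.
// Illustrative only -- not QuickNN's GruLayer; names and shapes are assumed.
static class GruSketch
{
    static double Sigmoid(double v) => 1.0 / (1.0 + Math.Exp(-v));

    // x: current input, hPrev: previous hidden state.
    // Each gate has its own input weight (w*), recurrent weight (u*), and bias (b*).
    public static double Step(
        double x, double hPrev,
        double wz, double uz, double bz,   // update gate parameters
        double wr, double ur, double br,   // reset gate parameters
        double w, double u, double b)      // candidate-state parameters
    {
        double z = Sigmoid(wz * x + uz * hPrev + bz);           // update gate: how much old memory remains
        double r = Sigmoid(wr * x + ur * hPrev + br);           // reset gate: how much old memory feeds the candidate
        double hTilde = Math.Tanh(w * x + u * (r * hPrev) + b); // candidate state from new input plus reset memory
        return (1 - z) * hTilde + z * hPrev;                    // blend candidate with previous memory
    }
}

Notice there is no separate cell state: hPrev is both the memory and the output, which is one of the differences from an LSTM that we will look at next.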

Differences between LSTM and GRU

There are a few subtle differences between an LSTM and a GRU, although to be perfectly honest, there are more similarities than differences! For starters, a GRU has one fewer gate than an LSTM. As you can see in the following diagram, an LSTM has an input gate, a forget gate, and an output gate. A GRU, on the other hand, has only two gates: a reset gate and an update gate. The reset gate determines how to combine new inputs with the previous memory, and the update gate determines how much of the previous memory is retained:

[Figure: LSTM gates versus GRU gates]

Here is another interesting fact: if we set the reset gate to all 1s and the update gate to all 0s, do you know what we have? If you guessed a plain old recurrent neural network, you'd be right! The short derivation after this paragraph works through why.
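
Plugging those gate settings into the GruSketch.Step method shown earlier makes the reduction explicit (again, illustrative names rather than QuickNN code):

// With z = 0, the blend keeps only the candidate:
//   h = (1 - 0) * hTilde + 0 * hPrev  =  hTilde
// With r = 1, the candidate sees the whole previous state:
//   hTilde = tanh(w * x + u * (1 * hPrev) + b)
// Put together, Step(...) with r fixed to 1 and z fixed to 0
// computes exactly the classic vanilla-RNN update:
static double VanillaRnnStep(double x, double hPrev, double w, double u, double b)
    => Math.Tanh(w * x + u * hPrev + b);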

Here are the key differences between an LSTM and a GRU:

  • A GRU has two gates; an LSTM has three.
  • GRUs do not have an internal memory (cell state) that is separate from the exposed hidden state, and they lack the output gate that LSTMs have.

Coding different networks

In this section, we are going to look at the sample code we described earlier in this chapter. Specifically, we are going to look at how we build the different networks. The NetworkBuilder is our main object for building the four different types of networks we need for this exercise. Feel free to modify it and add additional networks if you so desire. Currently, it supports the following networks:

  • LSTM
  • RNN
  • GRU
  • Feedforward

The one thing that you will notice in our sample is that the only difference between the networks is how each network is created via the NetworkBuilder; all the remaining code stays the same, as the call-site sketch that follows makes concrete. You will also note, if you look through the example source code, that the number of iterations, or epochs, is much lower in the GRU sample. This is because GRUs are typically easier to train and therefore require fewer iterations. While...
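
As a hedged sketch of what those call sites might look like: MakeLstm matches the builder function shown later in this chapter, while MakeRnn, MakeGru, and MakeFeedForward (and the SigmoidUnit/TanhUnit activation types) are assumed to follow the same pattern; the actual QuickNN names and signatures may differ:

// Illustrative only: apart from MakeLstm, these builder and type names are assumptions.
Random rng = new Random(42);
INonlinearity decoder = new SigmoidUnit();  // hypothetical activation type

// inputDimension=2, hiddenDimension=30, hiddenLayers=1, outputDimension=1
NeuralNetwork lstm = NetworkBuilder.MakeLstm(2, 30, 1, 1, decoder, 0.08, rng);
NeuralNetwork gru = NetworkBuilder.MakeGru(2, 30, 1, 1, decoder, 0.08, rng);
NeuralNetwork rnn = NetworkBuilder.MakeRnn(2, 30, 1, 1, new TanhUnit(), decoder, 0.08, rng);
NeuralNetwork feedForward =
    NetworkBuilder.MakeFeedForward(2, 30, 1, 1, new TanhUnit(), decoder, 0.08, rng);

// From here on, the training loop and evaluation code are identical,
// no matter which of the four networks was built.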

Comparing LSTM, GRU, Feedforward, and RNN operations

In order to help you see the difference in both the creation and results of all the network objects we have been dealing with, I created the sample code that follows. This sample will allow you to see the difference in training times for all four of the network types. As stated previously, the GRU is the easiest to train and therefore will complete faster (in fewer iterations) than the other networks. When executing the code, you will see that the GRU typically achieves the optimal error rate in under 10,000 iterations, while a conventional RNN or LSTM can take 50,000 or more iterations to converge properly.

Here is what our sample code looks like:

// Note: the Color argument implies a colored-console library such as
// Colorful.Console rather than System.Console
static void Main(string[] args)
{
    Console.WriteLine("Running GRU sample", Color.Yellow);
    Console.ReadKey();
    ExampleGRU.Run();   // train and evaluate the GRU network
    Console.ReadKey();
    Console.WriteLine...
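
The excerpt cuts off mid-statement. A plausible continuation of the driver follows; only ExampleGRU appears in the excerpt above, so the other Example* class names here are assumptions for illustration:

// Hypothetical sketch of the full driver: ExampleLSTM, ExampleRNN, and
// ExampleFeedForward are assumed companions to ExampleGRU.
static void Main(string[] args)
{
    Console.WriteLine("Running GRU sample", Color.Yellow);
    Console.ReadKey();
    ExampleGRU.Run();
    Console.ReadKey();

    Console.WriteLine("Running LSTM sample", Color.Yellow);
    Console.ReadKey();
    ExampleLSTM.Run();
    Console.ReadKey();

    Console.WriteLine("Running RNN sample", Color.Yellow);
    Console.ReadKey();
    ExampleRNN.Run();
    Console.ReadKey();

    Console.WriteLine("Running Feedforward sample", Color.Yellow);
    Console.ReadKey();
    ExampleFeedForward.Run();
    Console.ReadKey();
}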

Network differences

As mentioned earlier, the only difference between our networks is the layers that are created and added to the network object. In an LSTM we add LSTM layers; in a GRU, unsurprisingly, we add GRU layers, and so forth. All four types of creation function are displayed as follows for you to compare:

public static NeuralNetwork MakeLstm(int inputDimension, int hiddenDimension,
    int hiddenLayers, int outputDimension, INonlinearity decoderUnit,
    double initParamsStdDev, Random rng)
{
    List<ILayer> layers = new List<ILayer>();
    for (int h = 0; h < hiddenLayers; h++)
    {
        // The first hidden layer consumes the input; deeper layers are hidden-to-hidden
        layers.Add(h == 0
            ? new LstmLayer(inputDimension, hiddenDimension, initParamsStdDev, rng)
            : new LstmLayer(hiddenDimension, hiddenDimension, initParamsStdDev, rng));
    }
    // A feedforward decoder layer maps the last hidden state to the output
    layers.Add(new FeedForwardLayer(hiddenDimension, outputDimension, decoderUnit...
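
For comparison, here is a hedged sketch of what the GRU builder would look like under the same pattern, assuming a GruLayer whose constructor mirrors LstmLayer's; the real QuickNN signature may differ slightly:

// Illustrative sketch: GruLayer and the tail of the FeedForwardLayer
// constructor are assumed to mirror the LSTM builder above.
public static NeuralNetwork MakeGru(int inputDimension, int hiddenDimension,
    int hiddenLayers, int outputDimension, INonlinearity decoderUnit,
    double initParamsStdDev, Random rng)
{
    List<ILayer> layers = new List<ILayer>();
    for (int h = 0; h < hiddenLayers; h++)
    {
        // Same layer-stacking pattern as MakeLstm, with GRU layers instead
        layers.Add(h == 0
            ? new GruLayer(inputDimension, hiddenDimension, initParamsStdDev, rng)
            : new GruLayer(hiddenDimension, hiddenDimension, initParamsStdDev, rng));
    }
    layers.Add(new FeedForwardLayer(hiddenDimension, outputDimension,
        decoderUnit, initParamsStdDev, rng));
    return new NeuralNetwork(layers);
}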

Summary

In this chapter, we learned about GRUs and saw how they compare to, and differ from, LSTM networks. We walked through an example program that exercised all the network types we discussed and produced their outputs, and we compared how each of these networks is created.

I hope you enjoyed your journey with me throughout this book. As authors, we try to better understand what readers would like to see and hear, so I welcome your constructive comments and feedback, which will only help to make the book and source code better. Till the next book, happy coding!
