GRUs Compared to LSTMs, RNNs, and Feedforward networks

In this chapter, we're going to talk about gated recurrent units (GRUs) and compare them to the LSTMs we learned about in the previous chapter. As you know, LSTMs have been around since 1997 and are among the most widely used models in deep learning for NLP today. GRUs, first presented in 2014, are a simpler variant of LSTMs that shares many of the same properties but is easier and faster to train, typically with lower computational complexity.

In this chapter, we will learn about the following:

  • GRUs
  • How GRUs differ from LSTMs
  • How to implement a GRU
  • GRU, LSTM, RNN, and Feedforward comparisons
  • Network differences

Technical requirements

You will be required to have a basic knowledge of .NET development using Microsoft Visual Studio and C#. You will need to download the code for this chapter from the book website.

Check out the following video to see Code in Action: http://bit.ly/2OHd7o5.

QuickNN

To follow along with the code, you should have the QuickNN solution open inside Microsoft Visual Studio. We will use this code to explain some of the finer points of coding the different networks, as well as to compare them. Here is the solution you should have loaded:

[Figure: The QuickNN solution loaded in Visual Studio]

Understanding GRUs

GRUs are cousins of long short-term memory (LSTM) recurrent neural networks. Both LSTM and GRU networks have additional parameters that control when and how their internal memory is updated, and both can capture long- and short-term dependencies in sequences. GRU networks, however, have fewer parameters than their LSTM cousins and, as a result, are faster to train. The GRU learns how to use its reset and update gates to make longer-term predictions while protecting its memory. Let's look at a simple diagram of a GRU:

[Figure: A simple diagram of a GRU]
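
To make the gating concrete, here is a minimal sketch of a single GRU step in C#. This is not the QuickNN implementation: the method and parameter names are illustrative, and it operates on one scalar unit rather than vectors, but the arithmetic is the standard GRU update:

using System;

// A minimal, self-contained sketch of one GRU time step for a single unit.
// Illustrative only -- not QuickNN's GruLayer; names and shapes are assumed.
static class GruSketch
{
    static double Sigmoid(double v) => 1.0 / (1.0 + Math.Exp(-v));

    // x: current input, hPrev: previous hidden state.
    // Each gate has its own input weight (w*), recurrent weight (u*), and bias (b*).
    public static double Step(
        double x, double hPrev,
        double wz, double uz, double bz,   // update gate parameters
        double wr, double ur, double br,   // reset gate parameters
        double w, double u, double b)      // candidate-state parameters
    {
        double z = Sigmoid(wz * x + uz * hPrev + bz);           // update gate: how much old memory remains
        double r = Sigmoid(wr * x + ur * hPrev + br);           // reset gate: how much old memory feeds the candidate
        double hTilde = Math.Tanh(w * x + u * (r * hPrev) + b); // candidate state from new input plus reset memory
        return (1 - z) * hTilde + z * hPrev;                    // blend candidate with previous memory
    }
}

Notice there is no separate cell state: hPrev is both the memory and the output, which is one of the differences from an LSTM that we will look at next.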

Differences between LSTM and GRU

There are a few subtle differences between an LSTM and a GRU, although to be perfectly honest, there are more similarities than differences! For starters, a GRU has one fewer gate than an LSTM. As you can see in the following diagram, an LSTM has an input gate, a forget gate, and an output gate. A GRU, on the other hand, has only two gates: a reset gate and an update gate. The reset gate determines how to combine new inputs with the previous memory, and the update gate determines how much of the previous memory is retained:

[Figure: LSTM gates versus GRU gates]

Here is another interesting fact: if we set the reset gate to all 1s and the update gate to all 0s, do you know what we have? If you guessed a plain old recurrent neural network, you'd be right! The short derivation after this paragraph works through why.
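
Plugging those gate settings into the GruSketch.Step method shown earlier makes the reduction explicit (again, illustrative names rather than QuickNN code):

// With z = 0, the blend keeps only the candidate:
//   h = (1 - 0) * hTilde + 0 * hPrev  =  hTilde
// With r = 1, the candidate sees the whole previous state:
//   hTilde = tanh(w * x + u * (1 * hPrev) + b)
// Put together, Step(...) with r fixed to 1 and z fixed to 0
// computes exactly the classic vanilla-RNN update:
static double VanillaRnnStep(double x, double hPrev, double w, double u, double b)
    => Math.Tanh(w * x + u * hPrev + b);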

Here are the key differences between an LSTM and a GRU:

  • A GRU has two gates; an LSTM has three.
  • GRUs do not have an internal memory (cell state) that is separate from the exposed hidden state, and they lack the output gate that LSTMs have.

Coding different networks

In this section, we are going to look at the sample code we described earlier in this chapter. Specifically, we are going to look at how we build the different networks. The NetworkBuilder is our main object for building the four different types of networks we need for this exercise. Feel free to modify it and add additional networks if you so desire. Currently, it supports the following networks:

  • LSTM
  • RNN
  • GRU
  • Feedforward

The one thing that you will notice in our sample is that the only difference between the networks is how each network is created via the NetworkBuilder; all the remaining code stays the same, as the call-site sketch that follows makes concrete. You will also note, if you look through the example source code, that the number of iterations, or epochs, is much lower in the GRU sample. This is because GRUs are typically easier to train and therefore require fewer iterations. While...
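
As a hedged sketch of what those call sites might look like: MakeLstm matches the builder function shown later in this chapter, while MakeRnn, MakeGru, and MakeFeedForward (and the SigmoidUnit/TanhUnit activation types) are assumed to follow the same pattern; the actual QuickNN names and signatures may differ:

// Illustrative only: apart from MakeLstm, these builder and type names are assumptions.
Random rng = new Random(42);
INonlinearity decoder = new SigmoidUnit();  // hypothetical activation type

// inputDimension=2, hiddenDimension=30, hiddenLayers=1, outputDimension=1
NeuralNetwork lstm = NetworkBuilder.MakeLstm(2, 30, 1, 1, decoder, 0.08, rng);
NeuralNetwork gru = NetworkBuilder.MakeGru(2, 30, 1, 1, decoder, 0.08, rng);
NeuralNetwork rnn = NetworkBuilder.MakeRnn(2, 30, 1, 1, new TanhUnit(), decoder, 0.08, rng);
NeuralNetwork feedForward =
    NetworkBuilder.MakeFeedForward(2, 30, 1, 1, new TanhUnit(), decoder, 0.08, rng);

// From here on, the training loop and evaluation code are identical,
// no matter which of the four networks was built.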

Comparing LSTM, GRU, Feedforward, and RNN operations

In order to help you see the difference in both the creation and results of all the network objects we have been dealing with, I created the sample code that follows. This sample will allow you to see the difference in training times for all four of the network types. As stated previously, the GRU is the easiest to train and therefore will complete faster (in fewer iterations) than the other networks. When executing the code, you will see that the GRU typically achieves the optimal error rate in under 10,000 iterations, while a conventional RNN or LSTM can take 50,000 or more iterations to converge properly.

Here is what our sample code looks like:

// Note: the Color argument implies a colored-console library such as
// Colorful.Console rather than System.Console
static void Main(string[] args)
{
    Console.WriteLine("Running GRU sample", Color.Yellow);
    Console.ReadKey();
    ExampleGRU.Run();   // train and evaluate the GRU network
    Console.ReadKey();
    Console.WriteLine...
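
The excerpt cuts off mid-statement. A plausible continuation of the driver follows; only ExampleGRU appears in the excerpt above, so the other Example* class names here are assumptions for illustration:

// Hypothetical sketch of the full driver: ExampleLSTM, ExampleRNN, and
// ExampleFeedForward are assumed companions to ExampleGRU.
static void Main(string[] args)
{
    Console.WriteLine("Running GRU sample", Color.Yellow);
    Console.ReadKey();
    ExampleGRU.Run();
    Console.ReadKey();

    Console.WriteLine("Running LSTM sample", Color.Yellow);
    Console.ReadKey();
    ExampleLSTM.Run();
    Console.ReadKey();

    Console.WriteLine("Running RNN sample", Color.Yellow);
    Console.ReadKey();
    ExampleRNN.Run();
    Console.ReadKey();

    Console.WriteLine("Running Feedforward sample", Color.Yellow);
    Console.ReadKey();
    ExampleFeedForward.Run();
    Console.ReadKey();
}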

Network differences

As mentioned earlier, the only difference between our networks is the layers that are created and added to the network object. In an LSTM we add LSTM layers; in a GRU, unsurprisingly, we add GRU layers, and so forth. All four types of creation function are displayed as follows for you to compare:

public static NeuralNetwork MakeLstm(int inputDimension, int hiddenDimension,
    int hiddenLayers, int outputDimension, INonlinearity decoderUnit,
    double initParamsStdDev, Random rng)
{
    List<ILayer> layers = new List<ILayer>();
    for (int h = 0; h < hiddenLayers; h++)
    {
        // The first hidden layer consumes the input; deeper layers are hidden-to-hidden
        layers.Add(h == 0
            ? new LstmLayer(inputDimension, hiddenDimension, initParamsStdDev, rng)
            : new LstmLayer(hiddenDimension, hiddenDimension, initParamsStdDev, rng));
    }
    // A feedforward decoder layer maps the last hidden state to the output
    layers.Add(new FeedForwardLayer(hiddenDimension, outputDimension, decoderUnit...
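
For comparison, here is a hedged sketch of what the GRU builder would look like under the same pattern, assuming a GruLayer whose constructor mirrors LstmLayer's; the real QuickNN signature may differ slightly:

// Illustrative sketch: GruLayer and the tail of the FeedForwardLayer
// constructor are assumed to mirror the LSTM builder above.
public static NeuralNetwork MakeGru(int inputDimension, int hiddenDimension,
    int hiddenLayers, int outputDimension, INonlinearity decoderUnit,
    double initParamsStdDev, Random rng)
{
    List<ILayer> layers = new List<ILayer>();
    for (int h = 0; h < hiddenLayers; h++)
    {
        // Same layer-stacking pattern as MakeLstm, with GRU layers instead
        layers.Add(h == 0
            ? new GruLayer(inputDimension, hiddenDimension, initParamsStdDev, rng)
            : new GruLayer(hiddenDimension, hiddenDimension, initParamsStdDev, rng));
    }
    layers.Add(new FeedForwardLayer(hiddenDimension, outputDimension,
        decoderUnit, initParamsStdDev, rng));
    return new NeuralNetwork(layers);
}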

Summary

In this chapter, we learned about GRUs and saw how they compare to, and differ from, LSTM networks. We walked through an example program that exercised all the network types we discussed and produced their outputs, and we compared how each of these networks is created.

I hope you enjoyed your journey with me throughout this book. As authors, we try to better understand what readers would like to see and hear, so I welcome your constructive comments and feedback, which will only help to make the book and source code better. Till the next book, happy coding!
