Understanding Recurrent Networks

In Chapter 1, The Nuts and Bolts of Neural Networks, and Chapter 2, Understanding Convolutional Networks, we took an in-depth look at the properties of general feedforward networks and their specialized incarnation, Convolutional Neural Networks (CNNs). In this chapter, we'll close this story arc with Recurrent Neural Networks (RNNs). The NN architectures we discussed in the previous chapters take a fixed-size input and produce a fixed-size output. RNNs lift this constraint: they can process input sequences of variable length by defining a recurrent relationship over these sequences (hence the name). If you are already familiar with some of the topics discussed in this chapter, you can skip them.

In this chapter, we will cover the following topics:

  • Introduction to RNNs
  • Introducing long short-term memory
  • Introducing...

Introduction to RNNs

RNNs are neural networks that can process sequential data of variable length. Examples of such data include the words of a sentence or the price of a stock at various moments in time. By using the word sequential, we imply that the elements of the sequence are related to each other and that their order matters. For example, if we take a book and randomly shuffle all of the words in it, the text will lose its meaning, even though we'll still know the individual words. Naturally, we can use RNNs to solve tasks that relate to sequential data. Examples of such tasks are language translation, speech recognition, predicting the next element of a time series, and so on.

RNNs get their name because they apply the same function over a sequence recurrently. We can define an RNN as a recurrence relation:

s_t = f(s_{t-1}, x_t)

Here, f is a differentiable function, s_t is a vector of...
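To make the recurrence concrete, the following is a minimal NumPy sketch of a single recurrent step and its application over a sequence. The tanh activation, the weight names U (input-to-state) and W (state-to-state), and the sizes are illustrative assumptions rather than the book's exact implementation:

import numpy as np

def rnn_step(x_t, s_prev, U, W):
    # One step of the recurrence s_t = f(s_{t-1}, x_t), with f chosen as tanh
    return np.tanh(x_t @ U + s_prev @ W)

# Illustrative sizes: 3 time steps, input size 4, state size 5
rng = np.random.default_rng(0)
U = rng.normal(size=(4, 5))    # input-to-state weights
W = rng.normal(size=(5, 5))    # state-to-state (recurrent) weights
xs = rng.normal(size=(3, 4))   # an input sequence of 3 elements

s = np.zeros(5)                # initial state s_0
for x_t in xs:                 # the same function is applied at every step
    s = rnn_step(x_t, s, U, W)

Note how the same weights U and W are reused at every step; this weight sharing is what lets the network handle sequences of any length.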

Introducing long short-term memory

Hochreiter and Schmidhuber studied the problems of vanishing and exploding gradients extensively and came up with a solution called Long Short-Term Memory (LSTM, https://www.bioinf.jku.at/publications/older/2604.pdf). LSTMs can handle long-term dependencies due to a specially crafted memory cell. In fact, they work so well that most of the current accomplishments in training RNNs on a variety of problems are due to the use of LSTMs. In this section, we'll explore how this memory cell works and how it solves the vanishing gradients issue.

The key idea of LSTM is the cell state, c_t (in addition to the hidden RNN state, h_t), where information can only be explicitly written in or removed, so that the state stays constant if there is no outside interference. The cell state can only be modified by specific gates, which are a way to let information...
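To make the gating mechanism concrete, here is a minimal PyTorch sketch of a single LSTM step built from primitive operations. The weight layout and names are illustrative assumptions (biases are omitted for brevity), so this is a sketch of the standard LSTM equations rather than the book's exact implementation:

import torch

def lstm_step(x_t, h_prev, c_prev, W_x, W_h):
    # W_x: (input_size, 4 * hidden_size), W_h: (hidden_size, 4 * hidden_size)
    gates = x_t @ W_x + h_prev @ W_h
    f, i, o, g = gates.chunk(4, dim=-1)          # forget, input, output gates and candidate
    f, i, o = torch.sigmoid(f), torch.sigmoid(i), torch.sigmoid(o)
    g = torch.tanh(g)                            # candidate cell values
    c_t = f * c_prev + i * g                     # explicitly erase (forget) and write (input)
    h_t = o * torch.tanh(c_t)                    # the hidden state is a gated view of c_t
    return h_t, c_t

The additive update of c_t (erasing and writing, rather than recomputing the whole state) is what allows gradients to flow across many time steps without vanishing.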

Introducing gated recurrent units

A Gated Recurrent Unit (GRU) is a type of recurrent block that was introduced in 2014 (Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, https://arxiv.org/abs/1406.1078 and Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, https://arxiv.org/abs/1412.3555) as an improvement over LSTM. A GRU cell usually has performance similar to or better than that of an LSTM, but it achieves this with fewer parameters and operations:

Figure: A GRU cell

Similar to the classic RNN, a GRU cell has a single hidden state, h_t. You can think of it as a combination of the hidden and cell states of an LSTM. The GRU cell has two gates:

  • An update gate, z_t, which combines the input and forget LSTM gates. It decides what information to discard and what new information to include in its place, based on the network input, x_t...
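The following is a minimal PyTorch sketch of a full GRU step, following the standard equations from the papers cited above; the weight names are illustrative and the bias terms are omitted, so this is a sketch rather than the book's exact code:

import torch

def gru_step(x_t, h_prev, W_xz, W_hz, W_xr, W_hr, W_xh, W_hh):
    z = torch.sigmoid(x_t @ W_xz + h_prev @ W_hz)            # update gate
    r = torch.sigmoid(x_t @ W_xr + h_prev @ W_hr)            # reset gate
    h_cand = torch.tanh(x_t @ W_xh + (r * h_prev) @ W_hh)    # candidate state
    # Interpolate between the old and candidate states (some formulations
    # swap the roles of z and 1 - z in this step)
    return z * h_prev + (1 - z) * h_cand

Unlike the LSTM sketch earlier, there is no separate cell state and no output gate, which is where the savings in parameters and operations come from.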

Implementing text classification

Let's recap this chapter so far. We started by implementing an RNN using only numpy. Then, we continued with an LSTM implementation using primitive PyTorch operations. We'll conclude this arc by training the default PyTorch 1.3.1 LSTM implementation on a text classification problem. This example also requires the torchtext 0.4.0 package. Text classification (or categorization) refers to the task of assigning categories (or labels) to a piece of text based on its contents. Text classification tasks include spam detection, topic labeling, and sentiment analysis. This type of problem is an example of a many-to-one relationship, which we defined in the Introduction to RNNs section.
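As an illustration of what such a many-to-one model might look like, here is a minimal sketch of an LSTM-based classifier using the built-in torch.nn.LSTM module. The class name, layer sizes, and the choice to classify from the final hidden state are illustrative assumptions, not the book's exact model:

import torch.nn as nn

class LSTMClassifier(nn.Module):
    # Many-to-one model: embed the tokens, run an LSTM, classify from the last hidden state
    def __init__(self, vocab_size, embedding_size=100, hidden_size=256, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_size)
        self.lstm = nn.LSTM(embedding_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):                    # x: (batch, sequence_length) token indices
        embedded = self.embedding(x)         # (batch, sequence_length, embedding_size)
        _, (h_n, _) = self.lstm(embedded)    # h_n: (1, batch, hidden_size), final hidden state
        return self.fc(h_n.squeeze(0))       # (batch, num_classes) class scores

The model consumes the whole review and produces a single prediction, which is exactly the many-to-one pattern mentioned above.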

In this section, we'll implement a sentiment analysis example over the Large Movie Review Dataset (http://ai.stanford.edu/~amaas/data/sentiment/), which consists of...

Summary

In this chapter, we discussed RNNs. First, we started with the RNN and backpropagation through time theory. Then, we implemented an RNN from scratch to solidify our knowledge on the subject. Next, we moved on to more complex LSTM and GRU cells using the same pattern: a theoretical explanation, followed by a practical PyTorch implementation. Finally, we combined our knowledge from Chapter 6, Language Modeling, with the new material from this chapter for a full-featured sentiment analysis task implementation.

In the next chapter, we'll discuss seq2seq models and their variations—an exciting new development in sequence processing.
