Modeling Sequential Data Using Recurrent Neural Networks

In the previous chapter, we focused on convolutional neural networks (CNNs). We covered the building blocks of CNN architectures and how to implement deep CNNs in TensorFlow. Finally, you learned how to use CNNs for image classification. In this chapter, we will explore recurrent neural networks (RNNs) and see their application in modeling sequential data.

We will cover the following topics:

  • Introducing sequential data
  • RNNs for modeling sequences
  • Long short-term memory (LSTM)
  • Truncated backpropagation through time (TBPTT)
  • Implementing a multilayer RNN for sequence modeling in TensorFlow
  • Project one: RNN sentiment analysis of the IMDb movie review dataset
  • Project two: RNN character-level language modeling with LSTM cells, using text data from Jules Verne's The Mysterious Island
  • Using gradient clipping to avoid exploding gradients
  • Introducing the Transformer model and understanding...

Introducing sequential data

Let's begin our discussion of RNNs by looking at the nature of sequential data, which is more commonly known as sequence data or simply sequences. We will take a look at the unique properties of sequences that make them different from other kinds of data. We will then see how we can represent sequential data and explore the various categories of models for sequential data, which are based on the input and output of a model. This will help us to explore the relationship between RNNs and sequences in this chapter.

Modeling sequential data – order matters

What makes sequences unique, compared to other types of data, is that the elements in a sequence appear in a certain order and are not independent of each other. Typical machine learning algorithms for supervised learning assume that the input data is independent and identically distributed (IID), which means that the training examples are mutually independent and have the same underlying distribution...
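
As a minimal illustration of why order matters (a sketch of our own, not code from this book), shuffling a sequence leaves an order-insensitive statistic such as the mean unchanged, but it destroys the order-dependent structure that a sequence model is meant to capture:

    import numpy as np

    # A toy time series: order carries information
    rng = np.random.default_rng(seed=1)
    x = np.cumsum(rng.normal(size=10))   # a random walk; each value depends on the past

    x_shuffled = rng.permutation(x)

    # An order-insensitive statistic is unchanged by shuffling ...
    print(np.isclose(x.mean(), x_shuffled.mean()))       # True

    # ... but order-sensitive structure (step-to-step differences) is destroyed
    print(np.allclose(np.diff(x), np.diff(x_shuffled)))  # False (almost surely)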

RNNs for modeling sequences

In this section, before we start implementing RNNs in TensorFlow, we will discuss the main concepts of RNNs. We will begin by looking at the typical structure of an RNN, which includes a recurrent (feedback) component for modeling sequence data. Then, we will examine how the neuron activations are computed in a typical RNN. This will create the context for us to discuss the common challenges in training RNNs, and solutions to those challenges, such as LSTM and gated recurrent units (GRUs).
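
As a quick preview (a sketch of our own, not the book's code), the Keras API in TensorFlow exposes these recurrent layers with a uniform interface; the layer and feature sizes below are arbitrary illustration values:

    import tensorflow as tf

    # Each recurrent layer maps a batch of sequences to a hidden representation
    seq_input = tf.keras.Input(shape=(None, 8))   # (time steps, features); length can vary

    simple = tf.keras.layers.SimpleRNN(16)(seq_input)  # basic RNN cell
    lstm   = tf.keras.layers.LSTM(16)(seq_input)       # gated cell for long-term dependencies
    gru    = tf.keras.layers.GRU(16)(seq_input)        # lighter-weight gated alternative

    model = tf.keras.Model(seq_input, [simple, lstm, gru])
    model.summary()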

Understanding the RNN looping mechanism

Let's start with the architecture of an RNN. The following figure shows a standard feedforward NN and an RNN side by side for comparison:

[Figure: a single-hidden-layer feedforward NN shown next to a single-hidden-layer RNN, whose hidden layer feeds back into itself through a recurrent edge]

Both of these networks have only one hidden layer. In this representation, the units are not displayed, but we assume that the input layer (x), hidden layer (h), and output layer (o) are vectors that contain many units.
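
To preview the computation we will examine in detail shortly, the hidden state of a basic RNN is updated at each time step t as h(t) = tanh(W_xh x(t) + W_hh h(t-1) + b_h). The following NumPy sketch (our own illustration; the weight names and sizes are arbitrary) unrolls this recurrence over a short sequence:

    import numpy as np

    rng = np.random.default_rng(seed=0)

    n_features, n_hidden, n_steps = 4, 3, 5
    W_xh = rng.normal(scale=0.1, size=(n_hidden, n_features))  # input-to-hidden weights
    W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))    # hidden-to-hidden (recurrent) weights
    b_h  = np.zeros(n_hidden)

    x_seq = rng.normal(size=(n_steps, n_features))  # one input sequence
    h = np.zeros(n_hidden)                          # initial hidden state h(0)

    for t in range(n_steps):
        # h(t) = tanh(W_xh x(t) + W_hh h(t-1) + b_h)
        h = np.tanh(W_xh @ x_seq[t] + W_hh @ h + b_h)
        print(f"h({t + 1}) = {h}")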

Determining the type of output from an...

Implementing RNNs for sequence modeling in TensorFlow

Now that we have covered the underlying theory behind RNNs, we are ready to move on to the more practical part of this chapter: implementing RNNs in TensorFlow. During the rest of this chapter, we will apply RNNs to two common problems:

  1. Sentiment analysis
  2. Language modeling

These two projects, which we will walk through together in the following pages, are both fascinating and quite involved. Thus, instead of providing all the code at once, we will break the implementation up into several steps and discuss the code in detail. If you would like to get a big-picture overview and see all the code at once before diving into the discussion, take a look at the code implementation first, which you can view at https://github.com/rasbt/python-machine-learning-book-3rd-edition/tree/master/ch16.

Project one – predicting the sentiment of IMDb movie reviews

You may recall from Chapter 8, Applying...
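
To give a flavor of the kind of model we will build for this task, here is a condensed sketch (not the book's exact implementation: it loads the dataset via tf.keras.datasets.imdb instead of the preprocessing pipeline developed in the book, and the vocabulary size, sequence length, and layer sizes are illustrative assumptions):

    import tensorflow as tf

    vocab_size, maxlen = 10000, 200
    (x_train, y_train), (x_test, y_test) = \
        tf.keras.datasets.imdb.load_data(num_words=vocab_size)
    x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
    x_test  = tf.keras.preprocessing.sequence.pad_sequences(x_test, maxlen=maxlen)

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, 32),                # token IDs -> dense vectors
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),  # sequence -> single state
        tf.keras.layers.Dense(1, activation='sigmoid'),           # binary sentiment
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    model.fit(x_train, y_train, validation_split=0.2, epochs=3, batch_size=64)
    print(model.evaluate(x_test, y_test))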

Understanding language with the Transformer model

In this chapter, we solved two sequence modeling problems using RNN-based NNs. However, a new architecture has recently emerged that has been shown to outperform the RNN-based seq2seq models in several NLP tasks.

Called the Transformer architecture, it is capable of modeling global dependencies between input and output sequences, and was introduced in 2017 by Ashish Vaswani et al. in the NeurIPS paper Attention Is All You Need (available online at http://papers.nips.cc/paper/7181-attention-is-all-you-need). The Transformer architecture is based on a concept called attention, and more specifically, the self-attention mechanism. Let's consider the sentiment analysis task that we covered earlier in this chapter. In this case, using the attention mechanism would mean that our model could learn to focus on the parts of an input sequence that are most relevant to the sentiment.
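
As a minimal sketch of the idea (our own illustration, not the paper's or the book's code), scaled dot-product self-attention projects each input vector into a query, a key, and a value, and then mixes the values according to softmax-normalized query-key similarities; the weight matrices W_q, W_k, and W_v below are randomly initialized purely for demonstration:

    import numpy as np

    def softmax(z, axis=-1):
        z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
        e = np.exp(z)
        return e / e.sum(axis=axis, keepdims=True)

    def self_attention(X, W_q, W_k, W_v):
        """Scaled dot-product self-attention over one sequence X of shape (T, d)."""
        Q, K, V = X @ W_q, X @ W_k, X @ W_v
        d_k = K.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)     # pairwise relevance of all positions
        weights = softmax(scores, axis=-1)  # each row sums to 1
        return weights @ V                  # weighted mix of the value vectors

    rng = np.random.default_rng(seed=0)
    T, d, d_k = 4, 8, 8
    X = rng.normal(size=(T, d))             # e.g., embeddings of a 4-word sentence
    W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d, d_k)) for _ in range(3))
    print(self_attention(X, W_q, W_k, W_v).shape)  # (4, 8)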

Understanding the self-attention mechanism...

Summary

In this chapter, you first learned about the properties of sequences that make them different from other types of data, such as structured data or images. We then covered the foundations of RNNs for sequence modeling. You learned how a basic RNN model works, and we discussed its limitations with regard to capturing long-term dependencies in sequence data. Next, we covered LSTM cells, which use a gating mechanism to reduce the effect of the exploding and vanishing gradient problems that are common in basic RNN models.

After discussing the main concepts behind RNNs, we implemented several RNN models with different recurrent layers using the Keras API. In particular, we implemented an RNN model for sentiment analysis, as well as an RNN model for generating text. Finally, we covered the Transformer model, which leverages the self-attention mechanism in order to focus on the relevant parts of a sequence.

In the next chapter, you will learn about generative models and, in particular...
