Hands-On Natural Language Processing with PyTorch 1.x

Product type: Book
Published in: Jul 2020
Publisher: Packt
ISBN-13: 9781789802740
Pages: 276
Edition: 1st Edition
Author: Thomas Dop

Table of Contents (14 chapters)

Preface
Section 1: Essentials of PyTorch 1.x for NLP
Chapter 1: Fundamentals of Machine Learning and Deep Learning
Chapter 2: Getting Started with PyTorch 1.x for NLP
Section 2: Fundamentals of Natural Language Processing
Chapter 3: NLP and Text Embeddings
Chapter 4: Text Preprocessing, Stemming, and Lemmatization
Section 3: Real-World NLP Applications Using PyTorch 1.x
Chapter 5: Recurrent Neural Networks and Sentiment Analysis
Chapter 6: Convolutional Neural Networks for Text Classification
Chapter 7: Text Translation Using Sequence-to-Sequence Neural Networks
Chapter 8: Building a Chatbot Using Attention-Based Neural Networks
Chapter 9: The Road Ahead
Other Books You May Enjoy

Chapter 5: Recurrent Neural Networks and Sentiment Analysis

In this chapter, we will look at Recurrent Neural Networks (RNNs), a variation of the basic feed forward neural networks in PyTorch that we learned how to build in Chapter 1, Fundamentals of Machine Learning. Generally, RNNs can be used for any task where data can be represented as a sequence. This includes things such as stock price prediction, where a time series of historic prices is treated as a sequence. We commonly use RNNs in NLP because text can be thought of as a sequence of individual words and modeled as such. While a conventional neural network takes a single vector as input to the model, an RNN can take a whole sequence of vectors. If we represent each word in a document as a vector embedding, we can represent a whole document as a sequence of vectors (or an order 3 tensor). We can then use RNNs (and a more sophisticated form of RNN known as Long Short-Term Memory (LSTM)) to learn from our data.
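
To make the idea of a document as a sequence of vectors concrete, the following is a minimal sketch (not taken from the book's code) that turns a toy sentence of word indices into a sequence of embedding vectors and passes it through PyTorch's built-in RNN layer. The vocabulary size, embedding dimension, and hidden size are arbitrary placeholder values:

import torch
import torch.nn as nn

# Placeholder sizes - purely illustrative
vocab_size, embedding_dim, hidden_dim = 100, 8, 16

# A toy "sentence" of 5 word indices, with a batch dimension of 1
sentence = torch.tensor([[4, 21, 7, 63, 2]])        # shape: (1, 5)

embedding = nn.Embedding(vocab_size, embedding_dim)
rnn = nn.RNN(embedding_dim, hidden_dim, batch_first=True)

embedded = embedding(sentence)                      # shape: (1, 5, 8) - a sequence of vectors
output, hidden = rnn(embedded)                      # hidden is the final hidden state, shape (1, 1, 16)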

In this chapter...

Technical requirements

Building RNNs

RNNs consist of recurrent layers. While these are similar in many ways to the fully connected layers within a standard feed forward neural network, recurrent layers maintain a hidden state that is updated at each step of the sequential input. This means that for any given sequence, the model is initialized with a hidden state, often represented as a one-dimensional vector. The first element of our sequence is then fed into our model and the hidden state is updated depending on some learned parameters. The second word is then fed into the network and the hidden state is updated again depending on some other learned parameters. These steps are repeated until the whole sequence has been processed and we are left with the final hidden state. This computation loop, with the hidden state carried over from the previous computation and updated, is why we refer to these networks as recurrent. The final hidden state is then connected to a further fully connected layer and...
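
As a rough illustration of that recurrence loop (a sketch only, with made-up dimensions rather than anything from the book), each step combines the current input vector with the previous hidden state through learned weight matrices and a non-linearity:

import torch

# Illustrative dimensions only
input_dim, hidden_dim, seq_len = 8, 16, 5

# Learned parameters (randomly initialised here for the sketch)
W_input = torch.randn(input_dim, hidden_dim)
W_hidden = torch.randn(hidden_dim, hidden_dim)
bias = torch.zeros(hidden_dim)

sequence = torch.randn(seq_len, input_dim)   # one vector per step of the sequence
hidden = torch.zeros(hidden_dim)             # the initial hidden state

for x in sequence:
    # The same parameters are reused at every step - this is the recurrence
    hidden = torch.tanh(x @ W_input + hidden @ W_hidden + bias)

# 'hidden' now holds the final hidden state for the whole sequence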

Introducing LSTMs

While RNNs allow us to use sequences of words as input to our models, they are far from perfect. RNNs suffer from two main flaws, which can be partially remedied by using a more sophisticated version of the RNN known as the LSTM.

The basic structure of RNNs means that it is very difficult for them to retain information over long ranges. Consider a sentence that's 20 words long. Between the first word affecting the initial hidden state and the last word being processed, the hidden state is updated 20 times, so by the time we reach the final hidden state it is very difficult for the network to retain information about the words at the beginning of the sentence. This means that RNNs aren't very good at capturing long-term dependencies within sequences. It also ties in with the vanishing gradient problem mentioned earlier, where it is very inefficient to backpropagate through long, sparse sequences of vectors.
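
To see why gradients vanish, note that backpropagating through the recurrence multiplies a gradient factor once per step; if that factor is below 1, the product shrinks towards zero as the sequence gets longer. A back-of-the-envelope illustration (the 0.9 and 0.5 factors are arbitrary, chosen only to show the effect):

# Compounding a per-step gradient factor over a 20-step sequence
print(0.9 ** 20)   # ~0.12   - the signal from the first word is heavily damped
print(0.5 ** 20)   # ~1e-06  - effectively zero, so early words barely influence learning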

Consider a long paragraph where...

Building a sentiment analyzer using LSTMs

We will now look at how to build our own simple LSTM to categorize sentences based on their sentiment. We will train our model on a dataset of 3,000 reviews that have been categorized as positive or negative. These reviews come from three different sources—film reviews, product reviews, and location reviews—in order to ensure that our sentiment analyzer is robust. The dataset is balanced so that it consists of 1,500 positive reviews and 1,500 negative reviews. We will start by importing our dataset and examining it:

import pandas as pd

# Read in the raw file of tab-separated reviews and labels
with open("sentiment labelled sentences/sentiment.txt") as f:
    reviews = f.read()

# One review per line; the review text and its label are separated by a tab
data = pd.DataFrame([review.split('\t') for review in reviews.split('\n')])
data.columns = ['Review', 'Sentiment']
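
The rest of the walkthrough is not shown in this excerpt. A model of the kind described in this chapter would pair an embedding layer with an LSTM whose final hidden state feeds a fully connected layer; the following is a minimal sketch of that shape of model, not the book's exact code, and all of the layer sizes are placeholder values:

import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    # A minimal LSTM classifier sketch: embedding -> LSTM -> fully connected layer
    def __init__(self, vocab_size, embedding_dim=50, hidden_dim=100):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 1)

    def forward(self, x):
        embedded = self.embedding(x)               # (batch, seq_len, embedding_dim)
        _, (hidden, _) = self.lstm(embedded)       # hidden: (1, batch, hidden_dim)
        return torch.sigmoid(self.fc(hidden[-1]))  # probability that the review is positive

model = SentimentLSTM(vocab_size=5000)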

Deploying the application on Heroku

We have now trained our model on our local machine and we can use it to make predictions. However, this is of limited use if we want other people to be able to use our model. If we host our model on a cloud-based platform, such as Heroku, and create a basic API, other people will be able to call the API and make predictions using our model.
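
This excerpt does not show how the API is implemented, but as a rough sketch of what such a prediction endpoint could look like, here is a minimal Flask app; the framework choice, the route name, and the predict_sentiment helper are assumptions for illustration only:

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
    # Assumes the request body is JSON of the form {"review": "some text"}
    review = request.get_json()['review']
    # predict_sentiment is a hypothetical helper that runs the trained model
    score = predict_sentiment(review)
    return jsonify({'positive': float(score)})

if __name__ == '__main__':
    app.run()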

Introducing Heroku

Heroku is a cloud-based platform where you can host your own basic programs. While the free tier of Heroku has a maximum upload size of 500 MB and limited processing power, this should be sufficient for us to host our model and create a basic API for making predictions with it.

The first step is to create a free account on Heroku and install the Heroku app. Then, in the command line, type the following command:

heroku login

Log in using your account details. Then, create a new Heroku project by typing the following command...

Summary

In this chapter, we discussed the fundamentals of RNNs and one of their main variants, the LSTM. We then demonstrated how you can build your own RNN from scratch and deploy it on the cloud-based platform Heroku. While RNNs are often used for deep learning on NLP tasks, they are by no means the only neural network architecture suitable for this task.

In the next chapter, we will look at convolutional neural networks and show how they can be used for NLP learning tasks.
