Hands-On Natural Language Processing with PyTorch 1.x

Product type: Book
Published in: Jul 2020
Publisher: Packt
ISBN-13: 9781789802740
Pages: 276
Edition: 1st Edition
Author: Thomas Dop

Table of Contents (14 chapters)

Preface
Section 1: Essentials of PyTorch 1.x for NLP
Chapter 1: Fundamentals of Machine Learning and Deep Learning
Chapter 2: Getting Started with PyTorch 1.x for NLP
Section 2: Fundamentals of Natural Language Processing
Chapter 3: NLP and Text Embeddings
Chapter 4: Text Preprocessing, Stemming, and Lemmatization
Section 3: Real-World NLP Applications Using PyTorch 1.x
Chapter 5: Recurrent Neural Networks and Sentiment Analysis
Chapter 6: Convolutional Neural Networks for Text Classification
Chapter 7: Text Translation Using Sequence-to-Sequence Neural Networks
Chapter 8: Building a Chatbot Using Attention-Based Neural Networks
Chapter 9: The Road Ahead
Other Books You May Enjoy

Chapter 5: Recurrent Neural Networks and Sentiment Analysis

In this chapter, we will look at Recurrent Neural Networks (RNNs), a variation of the basic feed forward neural networks in PyTorch that we learned how to build in Chapter 1, Fundamentals of Machine Learning. Generally, RNNs can be used for any task where data can be represented as a sequence. This includes things such as stock price prediction, where a time series of historic prices is treated as a sequence. We commonly use RNNs in NLP because text can be thought of as a sequence of individual words and modeled as such. While a conventional neural network takes a single vector as input to the model, an RNN can take a whole sequence of vectors. If we represent each word in a document as a vector embedding, we can represent a whole document as a sequence of vectors (or an order 3 tensor). We can then use RNNs (and a more sophisticated form of RNN known as Long Short-Term Memory (LSTM)) to learn from our data.
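
To make the idea of a document as a sequence of vectors concrete, the following is a minimal sketch (not taken from the book's code) that turns a toy sentence of word indices into a sequence of embedding vectors and passes it through PyTorch's built-in RNN layer. The vocabulary size, embedding dimension, and hidden size are arbitrary placeholder values:

import torch
import torch.nn as nn

# Placeholder sizes - purely illustrative
vocab_size, embedding_dim, hidden_dim = 100, 8, 16

# A toy "sentence" of 5 word indices, with a batch dimension of 1
sentence = torch.tensor([[4, 21, 7, 63, 2]])        # shape: (1, 5)

embedding = nn.Embedding(vocab_size, embedding_dim)
rnn = nn.RNN(embedding_dim, hidden_dim, batch_first=True)

embedded = embedding(sentence)                      # shape: (1, 5, 8) - a sequence of vectors
output, hidden = rnn(embedded)                      # hidden is the final hidden state, shape (1, 1, 16)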

In this chapter...

Technical requirements

Building RNNs

RNNs consist of recurrent layers. While these are similar in many ways to the fully connected layers within a standard feed forward neural network, recurrent layers maintain a hidden state that is updated at each step of the sequential input. This means that for any given sequence, the model is initialized with a hidden state, often represented as a one-dimensional vector. The first element of our sequence is then fed into our model and the hidden state is updated depending on some learned parameters. The second word is then fed into the network and the hidden state is updated again depending on some other learned parameters. These steps are repeated until the whole sequence has been processed and we are left with the final hidden state. This computation loop, with the hidden state carried over from the previous computation and updated, is why we refer to these networks as recurrent. The final hidden state is then connected to a further fully connected layer and...
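
As a rough illustration of that recurrence loop (a sketch only, with made-up dimensions rather than anything from the book), each step combines the current input vector with the previous hidden state through learned weight matrices and a non-linearity:

import torch

# Illustrative dimensions only
input_dim, hidden_dim, seq_len = 8, 16, 5

# Learned parameters (randomly initialised here for the sketch)
W_input = torch.randn(input_dim, hidden_dim)
W_hidden = torch.randn(hidden_dim, hidden_dim)
bias = torch.zeros(hidden_dim)

sequence = torch.randn(seq_len, input_dim)   # one vector per step of the sequence
hidden = torch.zeros(hidden_dim)             # the initial hidden state

for x in sequence:
    # The same parameters are reused at every step - this is the recurrence
    hidden = torch.tanh(x @ W_input + hidden @ W_hidden + bias)

# 'hidden' now holds the final hidden state for the whole sequence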

Introducing LSTMs

While RNNs allow us to use sequences of words as input to our models, they are far from perfect. RNNs suffer from two main flaws, which can be partially remedied by using a more sophisticated version of the RNN known as the LSTM.

The basic structure of RNNs means that it is very difficult for them to retain information over long ranges. Consider a sentence that's 20 words long. Between the first word affecting the initial hidden state and the last word being processed, the hidden state is updated 20 times, so by the time we reach the final hidden state it is very difficult for the network to retain information about the words at the beginning of the sentence. This means that RNNs aren't very good at capturing long-term dependencies within sequences. It also ties in with the vanishing gradient problem mentioned earlier, where it is very inefficient to backpropagate through long, sparse sequences of vectors.
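
To see why gradients vanish, note that backpropagating through the recurrence multiplies a gradient factor once per step; if that factor is below 1, the product shrinks towards zero as the sequence gets longer. A back-of-the-envelope illustration (the 0.9 and 0.5 factors are arbitrary, chosen only to show the effect):

# Compounding a per-step gradient factor over a 20-step sequence
print(0.9 ** 20)   # ~0.12   - the signal from the first word is heavily damped
print(0.5 ** 20)   # ~1e-06  - effectively zero, so early words barely influence learning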

Consider a long paragraph where...

Building a sentiment analyzer using LSTMs

We will now look at how to build our own simple LSTM to categorize sentences based on their sentiment. We will train our model on a dataset of 3,000 reviews that have been categorized as positive or negative. These reviews come from three different sources—film reviews, product reviews, and location reviews—in order to ensure that our sentiment analyzer is robust. The dataset is balanced so that it consists of 1,500 positive reviews and 1,500 negative reviews. We will start by importing our dataset and examining it:

import pandas as pd

# Read in the raw file of tab-separated reviews and labels
with open("sentiment labelled sentences/sentiment.txt") as f:
    reviews = f.read()

# One review per line; the review text and its label are separated by a tab
data = pd.DataFrame([review.split('\t') for review in reviews.split('\n')])
data.columns = ['Review', 'Sentiment']
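
The rest of the walkthrough is not shown in this excerpt. A model of the kind described in this chapter would pair an embedding layer with an LSTM whose final hidden state feeds a fully connected layer; the following is a minimal sketch of that shape of model, not the book's exact code, and all of the layer sizes are placeholder values:

import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    # A minimal LSTM classifier sketch: embedding -> LSTM -> fully connected layer
    def __init__(self, vocab_size, embedding_dim=50, hidden_dim=100):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 1)

    def forward(self, x):
        embedded = self.embedding(x)               # (batch, seq_len, embedding_dim)
        _, (hidden, _) = self.lstm(embedded)       # hidden: (1, batch, hidden_dim)
        return torch.sigmoid(self.fc(hidden[-1]))  # probability that the review is positive

model = SentimentLSTM(vocab_size=5000)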

Deploying the application on Heroku

We have now trained our model on our local machine and we can use it to make predictions. However, this is of limited use if we want other people to be able to use our model. If we host our model on a cloud-based platform, such as Heroku, and create a basic API, other people will be able to call the API and make predictions using our model.
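
This excerpt does not show how the API is implemented, but as a rough sketch of what such a prediction endpoint could look like, here is a minimal Flask app; the framework choice, the route name, and the predict_sentiment helper are assumptions for illustration only:

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
    # Assumes the request body is JSON of the form {"review": "some text"}
    review = request.get_json()['review']
    # predict_sentiment is a hypothetical helper that runs the trained model
    score = predict_sentiment(review)
    return jsonify({'positive': float(score)})

if __name__ == '__main__':
    app.run()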

Introducing Heroku

Heroku is a cloud-based platform where you can host your own basic programs. While the free tier of Heroku has a maximum upload size of 500 MB and limited processing power, this should be sufficient for us to host our model and create a basic API for making predictions with it.

The first step is to create a free account on Heroku and install the Heroku app. Then, in the command line, type the following command:

heroku login

Log in using your account details. Then, create a new Heroku project by typing the following command...

Summary

In this chapter, we discussed the fundamentals of RNNs and one of their main variants, the LSTM. We then demonstrated how you can build your own RNN from scratch and deploy it on the cloud-based platform Heroku. While RNNs are often used for deep learning on NLP tasks, they are by no means the only neural network architecture suitable for this task.

In the next chapter, we will look at convolutional neural networks and show how they can be used for NLP learning tasks.
