Hands-On Machine Learning with C++
By Kirill Kolodiazhnyi
Published in May 2020 by Packt (ISBN-13: 9781789955330)
Sentiment Analysis with Recurrent Neural Networks

Currently, recurrent neural networks (RNNs) are among the most well-known and practical approaches to constructing deep neural networks. They are designed to process time-series and other sequential data. Typically, data of this nature is found in the following tasks:

  • Natural language text processing, such as text analysis and automatic translation
  • Automatic speech recognition
  • Video processing, for predicting the next frame based on previous frames, and for recognizing emotions
  • Image processing, for generating image descriptions
  • Time series analysis, for predicting fluctuations in exchange rates or company stock prices

In recurrent networks, connections between elements form a directed sequence. This makes it possible to process a time series of events or sequential spatial chains. Unlike multilayer perceptrons, recurrent networks...

Technical requirements

An overview of the RNN concept

The goal of an RNN is to use data consistently under the assumption that consecutive data elements depend on each other. In traditional neural networks, all inputs and outputs are assumed to be independent, but for many tasks this assumption is unsuitable. If you want to predict the next word in a sentence, for example, knowing the sequence of words that precede it is the most reliable way to do so. RNNs are called recurrent because they perform the same task for each element of the sequence, with the output depending on previous calculations.

In other words, RNNs are networks that have feedback loops and memory. RNNs use this memory to take prior information and calculation results into account. The idea of a recurrent network can be represented as follows:

In the preceding diagram, a fragment of the neural network (a layer of neurons...
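To make the recurrent link concrete, here is a minimal sketch using the PyTorch C++ API (libtorch). The layer names, sizes, and dummy data are illustrative assumptions, not code from the book; the point is that the same two weight matrices are reused at every timestep while the hidden state h carries memory forward:

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
  const int64_t input_size = 8;   // illustrative sizes (assumptions)
  const int64_t hidden_size = 16;
  const int64_t seq_len = 5;

  // The same two transforms are reused at every timestep: one applied
  // to the current input, one to the previous hidden state.
  torch::nn::Linear input_to_hidden(input_size, hidden_size);
  torch::nn::Linear hidden_to_hidden(hidden_size, hidden_size);

  auto h = torch::zeros({1, hidden_size});              // initial memory
  auto inputs = torch::randn({seq_len, 1, input_size}); // dummy sequence

  for (int64_t t = 0; t < seq_len; ++t) {
    // h_t = tanh(W_x * x_t + W_h * h_{t-1}): the recurrent link
    h = torch::tanh(input_to_hidden(inputs[t]) + hidden_to_hidden(h));
  }
  std::cout << "final hidden state:\n" << h << '\n';
  return 0;
}
```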

Training RNNs using the concept of backpropagation through time

At the time of writing, the error backpropagation algorithm is used nearly everywhere for training neural networks. The result of performing inference on the training set of examples (in our case, the set of subsequences) is checked against the expected result (the labeled data). The difference between the actual and expected values is called the error. This error is propagated back through the network weights in the opposite direction. The network thus adapts to the labeled data, and the result of this adaptation is expected to work well on data the network did not encounter in the initial training examples (the generalization hypothesis).
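With an automatic differentiation framework, backpropagation through time reduces to unrolling the recurrence over the sequence and calling backward once on the final loss; the framework then propagates the error through every recorded timestep. The following libtorch sketch is an assumption-based illustration (the sizes, layer names, and MSE loss are arbitrary choices), not the book's implementation:

```cpp
#include <torch/torch.h>

int main() {
  const int64_t input_size = 4, hidden_size = 8, seq_len = 6;

  torch::nn::Linear input_to_hidden(input_size, hidden_size);
  torch::nn::Linear hidden_to_hidden(hidden_size, hidden_size);
  torch::nn::Linear hidden_to_output(hidden_size, 1);

  auto inputs = torch::randn({seq_len, 1, input_size}); // dummy sequence
  auto target = torch::randn({1, 1});                   // dummy label
  auto h = torch::zeros({1, hidden_size});

  // Unroll the recurrence: autograd records the graph for every timestep.
  for (int64_t t = 0; t < seq_len; ++t) {
    h = torch::tanh(input_to_hidden(inputs[t]) + hidden_to_hidden(h));
  }
  auto loss = torch::mse_loss(hidden_to_output(h), target);

  // A single backward call propagates the error back through all
  // timesteps, accumulating gradients into the shared weights (BPTT).
  loss.backward();
  return 0;
}
```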

In the case of a recurrent network, we have several options regarding which of the network's outputs to include in the error. This section describes the two main approaches: the first considers the...

Exploring RNN architectures

In this section, we will look at various kinds of RNN architectures and see how they differ from each other in design and implementation.

LSTM

Long short-term memory (LSTM) is a special kind of RNN architecture that's capable of learning long-term dependencies. It was introduced by Sepp Hochreiter and Jürgen Schmidhuber in 1997 and was subsequently improved upon and extended in the work of many other researchers. It solves many of the problems we've discussed and is now widely used.

In LSTM, each unit has a memory cell and three gates (filters): an input gate, an output gate, and a forget gate. The purpose of these gates is to...
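As a rough illustration of this structure (an assumption-based sketch, not the book's code), the PyTorch C++ API bundles the memory cell and the three gates into torch::nn::LSTMCell; a single call advances both the hidden state h and the cell state c by one timestep:

```cpp
#include <torch/torch.h>
#include <tuple>

int main() {
  const int64_t input_size = 10, hidden_size = 20, seq_len = 7;

  // LSTMCell owns the input, output, and forget gates internally.
  torch::nn::LSTMCell cell(
      torch::nn::LSTMCellOptions(input_size, hidden_size));

  auto h = torch::zeros({1, hidden_size}); // hidden state
  auto c = torch::zeros({1, hidden_size}); // cell (memory) state
  auto inputs = torch::randn({seq_len, 1, input_size});

  for (int64_t t = 0; t < seq_len; ++t) {
    // The gates decide what to write into c and what to expose in h.
    std::tie(h, c) = cell(inputs[t], std::make_tuple(h, c));
  }
  return 0;
}
```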

Understanding natural language processing with RNNs

Natural language processing (NLP) is a subfield of computer science that studies algorithms for processing and analyzing human languages. There is a variety of algorithms and approaches for teaching computers to solve tasks that involve human language data. Let's start with the basic principles used in this area. After all, a computer does not know how to read, so the first issue with NLP is teaching a machine to work with natural language words. One idea that comes to mind is to encode words with numbers in the order in which they appear in the dictionary. This idea is fairly simple: numbers are endless, and you can number and renumber words with ease. But it has a significant drawback: the words in a dictionary are in alphabetical order, so when we add new words, we need to renumber a lot...
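One common way around this drawback is to number words by order of first appearance rather than alphabetically, so a new word simply takes the next free index and nothing already assigned is ever renumbered. A small self-contained sketch (the class and method names here are my own, purely illustrative):

```cpp
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

// Assign each word an index in order of first appearance, so adding
// new words never forces renumbering of existing ones.
class Vocabulary {
 public:
  size_t GetIndex(const std::string& word) {
    auto it = word_to_index_.find(word);
    if (it != word_to_index_.end())
      return it->second;
    size_t index = word_to_index_.size();  // next free index
    word_to_index_.emplace(word, index);
    return index;
  }

 private:
  std::unordered_map<std::string, size_t> word_to_index_;
};

int main() {
  Vocabulary vocab;
  std::vector<std::string> words = {"the", "movie", "was", "the", "best"};
  for (const auto& w : words)
    std::cout << w << " -> " << vocab.GetIndex(w) << "\n";
  // "the" keeps index 0 on its second occurrence.
  return 0;
}
```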

Sentiment analysis example with an RNN

In this section, we are going to build a machine learning model that can detect review sentiment (that is, whether a review is positive or negative) using PyTorch. As the training set, we are going to use the Large Movie Review Dataset, which contains 25,000 movie reviews for training and another 25,000 for testing, both highly polarized.

First, we have to develop parser and data loader classes that load the dataset into memory in a format suitable for use with PyTorch.

Let's start with the parser. The dataset is organized as follows: there are two folders, one each for the train and test sets, and each of these folders contains two child folders named pos and neg, where the positive and negative review files are placed. Each file in the dataset contains exactly one review, and its sentiment is determined by...
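Given that layout, a parser can simply walk the pos and neg folders and attach a label to each file's contents. The following sketch uses std::filesystem from C++17; the aclImdb path and the ReadReviews name are illustrative assumptions (the dataset unpacks into an aclImdb folder by default), not the book's actual classes:

```cpp
#include <filesystem>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Read every review file from the pos and neg child folders and pair
// its text with a sentiment label (1 = positive, 0 = negative).
std::vector<std::pair<std::string, int>> ReadReviews(const fs::path& root) {
  std::vector<std::pair<std::string, int>> items;
  const std::pair<const char*, int> folders[] = {{"pos", 1}, {"neg", 0}};
  for (const auto& [folder, label] : folders) {
    for (const auto& entry : fs::directory_iterator(root / folder)) {
      if (!entry.is_regular_file())
        continue;
      std::ifstream file(entry.path());
      std::stringstream buffer;
      buffer << file.rdbuf();  // one file == one review
      items.emplace_back(buffer.str(), label);
    }
  }
  return items;
}

int main() {
  // The dataset unpacks into aclImdb/train and aclImdb/test by default;
  // adjust the path for your setup.
  auto train = ReadReviews("aclImdb/train");
  std::cout << "loaded " << train.size() << " reviews\n";
  return 0;
}
```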

Summary

In this chapter, we learned the basic principles of RNNs. This type of neural network is commonly used for sequence analysis. The main differences from feedforward network types are the existence of a recurrent link, weights that are shared across timesteps, the ability to save internal state in memory, and, in bidirectional networks, both forward and backward data flows.

We became familiar with different types of RNNs and saw that the simplest one suffers from vanishing and exploding gradients, while more advanced architectures deal with these problems successfully. We learned the basics of the LSTM architecture, which is based on a hidden state, a cell state, and three types of gates (filters) that control what information to use from the previous timestep, what information to forget, and what portion of information...

Further reading


About the author

Kirill Kolodiazhnyi is a seasoned software engineer with expertise in custom software development. He has several years of experience building machine learning models and data products using C++. He holds a bachelor's degree in Computer Science from the Kharkiv National University of Radio-Electronics. He lives and works in Kharkiv, Ukraine, with his wife and daughter.