You're reading from Advanced Deep Learning with R

Product type: Book
Published in: Dec 2019
Reading level: Expert
Publisher: Packt
ISBN-13: 9781789538779
Edition: 1st Edition
Author: Bharatendra Rai

Bharatendra Rai is a chairperson and professor of business analytics, and the director of the Master of Science in Technology Management program at the Charlton College of Business at UMass Dartmouth. He received a Ph.D. in industrial engineering from Wayne State University, Detroit. He received a master's in quality, reliability, and OR from Indian Statistical Institute, India. His current research interests include machine learning and deep learning applications. His deep learning lecture videos on YouTube are watched in over 198 countries. He has over 20 years of consulting and training experience in industries such as software, automotive, electronics, food, chemicals, and so on, in the areas of data science, machine learning, and supply chain management.

Text Classification Using Recurrent Neural Networks

Recurrent neural networks are useful for solving problems where the data involves sequences. Applications involving sequences include text classification, time series prediction, sequences of frames in videos, DNA sequences, and speech recognition.

In this chapter, we will develop a sentiment (positive or negative) classification model using a recurrent neural network. We will begin by preparing the data for the text classification model. We will then develop a sequential model, compile it, fit it, evaluate it, make predictions, and assess model performance using a confusion matrix. We will also review some tips for optimizing sentiment classification performance.

More specifically, in this chapter, we will cover the following topics:

  • Preparing data for model building...

Preparing data for model building

In this chapter, we'll use the Internet Movie Database (IMDb) movie review text data that's available in the Keras package. There is no need to download this data separately, as it can be accessed directly from the Keras library using code that we will discuss soon. This dataset is already preprocessed so that each review is represented as a sequence of integers. Text data cannot be used directly for model building, so this conversion into integer sequences is necessary before the data can serve as input to a deep learning network.
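
As a rough illustration of what converting text into a sequence of integers means, the following base R sketch ranks the words of a toy corpus by frequency and replaces each word with its rank. The corpus, the top_n cutoff of 3, and the helper name encode are made up for illustration. The IMDb data in Keras uses its own indexing scheme, with some low indices reserved for special markers, which this sketch does not try to reproduce.

```r
# Toy corpus standing in for movie reviews (hypothetical data)
reviews <- c("great movie great cast", "bad movie", "great fun")

# Split each review into words
tokens <- strsplit(reviews, " ", fixed = TRUE)

# Count word frequencies; rank by count, breaking ties alphabetically
freq <- table(unlist(tokens))
ranked <- names(freq)[order(-as.integer(freq), names(freq))]

# Keep only the top_n most frequent words (similar in spirit to num_words)
top_n <- 3
vocab <- head(ranked, top_n)

# Replace each word with its rank; words outside the vocabulary are dropped
encode <- function(words) {
  idx <- match(words, vocab)
  idx[!is.na(idx)]
}
encoded <- lapply(tokens, encode)

print(vocab)
print(encoded)
```

Restricting the vocabulary to the most frequent words keeps the integer range small, which is the same trade-off made when num_words is set to 500.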

We will start by loading the imdb data using the dataset_imdb function, where we will also specify the number of most frequent words as 500 using num_words. Then, we'll split the imdb data into train and test datasets. Let's take a look at the...

Developing a recurrent neural network model

In this section, we will develop the architecture for the recurrent neural network and compile it. Let's look at the following code:

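# Model architecture
library(keras)
model <- keras_model_sequential()
model %>%
  # Map each of the 500 word indices to a 32-dimensional vector
  layer_embedding(input_dim = 500, output_dim = 32) %>%
  # Simple recurrent layer with 8 hidden units
  layer_simple_rnn(units = 8) %>%
  # Single sigmoid unit for the binary sentiment output
  layer_dense(units = 1, activation = "sigmoid")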

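We start by initializing the model using the keras_model_sequential function. Then, we add an embedding layer, a simple recurrent neural network (RNN) layer, and a dense output layer. For the embedding layer, we specify input_dim to be 500, which is the same as the number of most frequent words that we specified earlier. The next layer is a simple RNN layer, with the number of hidden units specified as 8. The final dense layer has a single unit with a sigmoid activation, so its output lies between 0 and 1 and can be read as the predicted sentiment class probability.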
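
To get a feel for the size of this network, the trainable parameters can be counted by hand. The sketch below assumes the usual layouts for these layer types: an embedding lookup table, a simple RNN with input weights, recurrent weights, and one bias per unit, and a dense layer with one bias per output unit. The variable names are illustrative only.

```r
# Layer sizes taken from the model definition
vocab_size <- 500   # input_dim of the embedding layer
embed_dim  <- 32    # output_dim of the embedding layer
rnn_units  <- 8     # units in the simple RNN layer

# Embedding: one embed_dim-length vector per word index
embedding_params <- vocab_size * embed_dim

# Simple RNN: input weights + recurrent weights + bias, per hidden unit
rnn_params <- rnn_units * (embed_dim + rnn_units + 1)

# Dense output: one weight per RNN unit plus one bias
dense_params <- rnn_units * 1 + 1

total_params <- embedding_params + rnn_params + dense_params
print(c(embedding = embedding_params, rnn = rnn_params,
        dense = dense_params, total = total_params))
```

These figures can be compared against the parameter counts reported by summary(model). Most of the parameters sit in the embedding layer, so the vocabulary size has a much larger effect on model size than the number of RNN units.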

Note that the default activation function for the layer_simple_rnn layer is...

Fitting the model

The code for fitting the model is as follows:

# Fit model
model_one <- model %>% fit(train_x, train_y,
                           epochs = 10,
                           batch_size = 128,
                           validation_split = 0.2)

To fit the model, we use a 20% validation split. Of the 25,000 movie reviews in the training data, 20,000 are used for building the model, and the remaining 5,000 are used for validation, which is reported in the form of loss and accuracy. We run 10 epochs with a batch size of 128.
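
The numbers above follow from simple arithmetic, sketched here in base R. The variable names are illustrative, and the batch count assumes that a final, partially filled batch still counts as one batch.

```r
# Settings from the fit call
n_train_total    <- 25000   # reviews in the training data
validation_split <- 0.2
batch_size       <- 128

# Reviews held out for validation and reviews left for fitting
n_val <- round(n_train_total * validation_split)
n_fit <- n_train_total - n_val

# Number of weight updates per epoch (last batch may be smaller)
batches_per_epoch <- ceiling(n_fit / batch_size)

print(c(fit = n_fit, validation = n_val, batches = batches_per_epoch))
```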

When using a validation split, it is important to note that the split is positional rather than random: with 20%, the first 80% of the training data is used for training and the last 20% is used for validation. Thus, if the first 50% of the review data was negative and the last 50% was positive, the 20% validation split will cause model validation to be based only on positive reviews. Therefore, before using...
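
The effect of a positional split can be reproduced with a plain label vector, without fitting any model. The following sketch uses hypothetical labels sorted so that all negative reviews come first, and checks what share of the last 20% is positive. It also shows one common precaution, shuffling the rows before fitting. That is offered here as an assumption about good practice, not as the author's specific remedy.

```r
set.seed(123)

# Hypothetical labels: first half negative (0), second half positive (1)
n <- 25000
labels <- rep(c(0L, 1L), each = n / 2)

# Positions that a 20% positional validation split would take
n_fit   <- round(n * 0.8)
val_idx <- (n_fit + 1):n

# Share of positive reviews in the validation part of the sorted data
share_sorted <- mean(labels[val_idx])

# Shuffle with a random permutation, then look at the same positions
perm <- sample(n)
share_shuffled <- mean(labels[perm][val_idx])

print(c(sorted = share_sorted, shuffled = share_shuffled))
```

With real data, the same permutation would be applied to both inputs and labels, for example train_x[perm, ] and train_y[perm], assuming train_x is a matrix with one row per review, so that reviews and labels stay aligned.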

Model evaluation and prediction

First, we will evaluate the model based on the train data for loss and accuracy. We will also obtain a confusion matrix based on the train data. The same process shall be repeated with the test data.

Training data

We will use the evaluate function to obtain the loss and accuracy values, as shown in the following code:

# Loss and accuracy
model %>% evaluate(train_x, train_y)
$loss
[1] 0.4057531

$acc
[1] 0.8206

As seen from the preceding output, the loss and accuracy values based on the training data are 0.406 and 0.821, respectively.
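
To make the two reported numbers more concrete, the sketch below computes a loss and an accuracy by hand for four made-up predictions. It assumes binary cross-entropy as the loss, a common choice for a single sigmoid output. The compile step is not shown in this excerpt, so that choice is an assumption, and the labels and probabilities are hypothetical.

```r
# Hypothetical true labels and predicted probabilities of the positive class
y <- c(1, 0, 1, 0)
p <- c(0.9, 0.2, 0.6, 0.7)

# Binary cross-entropy: average negative log-likelihood of the true labels
loss <- -mean(y * log(p) + (1 - y) * log(1 - p))

# Accuracy: share of cases where the thresholded prediction matches the label
pred_class <- as.integer(p > 0.5)
accuracy <- mean(pred_class == y)

print(c(loss = loss, accuracy = accuracy))
```

The last case is predicted positive with probability 0.7 but is actually negative, so it lowers the accuracy to 0.75 and contributes the largest term to the loss.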

Predictions using training data are used for developing a confusion matrix, as shown in the following code:

# Prediction and confusion matrix
pred <- model ...
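
Independently of how the predictions are obtained, a confusion matrix of this kind can be built with base R's table function. The sketch below uses made-up probabilities and labels, and a 0.5 threshold that is assumed here for illustration. It is not the author's code.

```r
# Hypothetical actual classes and predicted probabilities
actual <- c(0, 0, 1, 1, 1, 0)
prob   <- c(0.1, 0.6, 0.8, 0.4, 0.9, 0.3)

# Convert probabilities to class labels using a 0.5 threshold
predicted <- as.integer(prob > 0.5)

# Cross-tabulate predicted against actual classes
cm <- table(Predicted = predicted, Actual = actual)
print(cm)

# Overall accuracy is the share of cases on the diagonal
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

The off-diagonal cells separate the two kinds of error: negative reviews predicted as positive, and positive reviews predicted as negative.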

Performance optimization tips and best practices

When developing a recurrent neural network model, we come across situations where we need to make several decisions related to the network. These decisions could include trying a different activation function rather than the default one that we had used. Let's make such changes and see what impact they have on the movie review sentiment classification performance of the model.

In this section, we will experiment with the following four factors:

  • Number of units in the simple RNN layer
  • Using different activation functions in the simple RNN layer
  • Adding more recurrent layers
  • Changes in the maximum length for padding sequences
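
The last factor, the maximum length for padding sequences, controls how many integers from each review reach the model. The base R sketch below imitates that step on two toy sequences. It assumes that zeros are added at the front of short sequences and that long sequences keep their last maxlen entries, which mirrors the default settings of pad_sequences. The helper pad_one and the toy data are hypothetical.

```r
# Pad or truncate one integer sequence to exactly maxlen entries
pad_one <- function(x, maxlen) {
  x <- tail(x, maxlen)                    # keep at most the last maxlen values
  c(rep(0L, maxlen - length(x)), x)       # left-pad with zeros if too short
}

# Two toy reviews already encoded as integers (hypothetical data)
seqs <- list(c(5L, 3L), c(1L, 2L, 3L, 4L, 5L, 6L))

# One row per review, maxlen columns
padded <- t(sapply(seqs, pad_one, maxlen = 4))
print(padded)
```

A larger maxlen keeps more of each long review at the cost of longer input sequences, which is why it is worth treating as a tuning factor.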

Number of units in the simple RNN layer

The...

Summary

In this chapter, we illustrated the use of a recurrent neural network model for text sentiment classification using IMDb movie review data. Compared to a regular densely connected network, recurrent neural networks are better suited to sequential data. Text data is one such example that we worked with in this chapter.

In general, deep networks involve many factors or variables, and this calls for some experimentation with different levels of these factors before arriving at a useful model. In this chapter, we also developed five different movie review sentiment classification models.

A variant of recurrent neural networks that has become popular is the Long Short-Term Memory (LSTM) network. LSTM networks are capable of learning long-term dependencies and help recurrent networks remember inputs for a longer time.

In the...
