You're reading from Advanced Deep Learning with R

Product type: Book
Published in: Dec 2019
Reading level: Expert
Publisher: Packt
ISBN-13: 9781789538779
Edition: 1st

Author: Bharatendra Rai

Bharatendra Rai is a chairperson and professor of business analytics, and the director of the Master of Science in Technology Management program at the Charlton College of Business at UMass Dartmouth. He received a Ph.D. in industrial engineering from Wayne State University, Detroit. He received a master's in quality, reliability, and OR from Indian Statistical Institute, India. His current research interests include machine learning and deep learning applications. His deep learning lecture videos on YouTube are watched in over 198 countries. He has over 20 years of consulting and training experience in industries such as software, automotive, electronics, food, chemicals, and so on, in the areas of data science, machine learning, and supply chain management.

Text Classification Using Convolutional Recurrent Neural Networks

Convolutional neural networks (CNNs) have been found to be useful for capturing high-level local features from data, while recurrent neural networks (RNNs), such as long short-term memory (LSTM) networks, have been found to be useful for capturing long-term dependencies in sequence data such as text. Using CNN and RNN layers in the same model architecture gives rise to what's called a convolutional recurrent neural network (CRNN).

This chapter illustrates how to apply convolutional recurrent neural networks to text classification problems, combining the advantages of RNN and CNN architectures. The steps involved in this process include text data preparation, defining a convolutional recurrent network model, training the model, and model assessment.

More specifically, in this chapter...

Working with the reuter_50_50 dataset

In the previous chapters, when dealing with text data, we made use of data that had already been converted into a sequence of integers for developing deep network models. In this chapter, we will use text data that needs to be converted into a sequence of integers. We will start by reading the data that we will use to illustrate how to develop a text classification deep network model. We will also explore the dataset that we'll use so that we have a better understanding of it.

In this chapter, we will make use of the keras, deepviz, and readtext libraries, as shown in the following code:

# Libraries used
library(keras)
library(deepviz)
library(readtext)

For illustrating the steps involved in developing a convolutional recurrent network model, we will make use of the reuter_50_50 text dataset, which is available from the UCI Machine Learning...
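A hedged sketch of reading the data with readtext follows. The directory path and the C50train folder layout (one subfolder per author, containing that author's articles) are assumptions about where the downloaded archive has been extracted:

```r
library(readtext)

# Read all training articles into a data frame; readtext returns
# one row per file, with a doc_id (file path) and a text column.
# Assumes the reuter_50_50 archive is unzipped into ./C50
train <- readtext("C50/C50train/*/*.txt")

# Recover each author's name from the file path to use as the label
train$author <- basename(dirname(train$doc_id))
```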

Preparing the data for model building

In this section, we will prepare the data so that we can develop an author classification model. We will start by tokenizing the text data, which is available in the form of articles, and converting it into sequences of integers. We will also assign a unique integer to identify each author. Subsequently, we will use padding and truncation so that the integer sequences representing the articles by the 50 authors all have the same length. We will end this section by partitioning the training data into train and validation datasets and then one-hot encoding the response variable.

Tokenization and converting text into a sequence of integers

We will start by carrying out tokenization...
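The tokenization, padding, and one-hot encoding steps described above can be sketched with the keras tokenizer functions. The limits of 500 words and a sequence length of 300 follow the values mentioned in this chapter, but the variable names (train$text, trainy_int) and the exact code are illustrative assumptions:

```r
library(keras)

# Build a tokenizer restricted to the 500 most frequent words
tokenizer <- text_tokenizer(num_words = 500) %>%
  fit_text_tokenizer(train$text)

# Convert each article into a sequence of word indices
trainseq <- texts_to_sequences(tokenizer, train$text)

# Pad or truncate every sequence to a common length of 300
trainx <- pad_sequences(trainseq, maxlen = 300)

# One-hot encode the integer author labels (0 to 49)
trainy <- to_categorical(trainy_int, num_classes = 50)
```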

Developing the model architecture

In this section, we will make use of convolutional and LSTM layers in the same network. The convolutional recurrent network architecture can be captured in the form of a simple flowchart:

Here, we can see that the flowchart contains embedding, convolutional 1D, max-pooling, LSTM, and dense layers. Note that the embedding layer is always the first layer in the network and is commonly used for applications involving text data. The main purpose of the embedding layer is to map each unique word (500 words in our example) to a vector that is smaller in size, whose dimension we specify using output_dim. In the convolutional layer, we will use the relu activation function. Similarly, the activation functions used for the LSTM and dense layers will be tanh and softmax, respectively.

We can use the following...
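A minimal sketch of this architecture is shown below. The output_dim, filter, kernel, and unit values are illustrative assumptions; only the layer types, their order, and the activations follow the description above:

```r
library(keras)

# Embedding -> Conv1D -> max pooling -> LSTM -> dense, as in the flowchart
model <- keras_model_sequential() %>%
  layer_embedding(input_dim = 500,     # 500 unique words
                  output_dim = 32,     # assumed embedding size
                  input_length = 300) %>%
  layer_conv_1d(filters = 32, kernel_size = 5, activation = "relu") %>%
  layer_max_pooling_1d(pool_size = 4) %>%
  layer_lstm(units = 32, activation = "tanh") %>%
  layer_dense(units = 50, activation = "softmax")  # one unit per author
```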

Compiling and fitting the model

In this section, we will compile the model and then train it with the fit function, using the training and validation datasets. We will also plot the loss and accuracy values obtained while training the model.

Compiling the model

For compiling the model, we will use the following code:

# Compile model
model %>% compile(optimizer = "adam",
                  loss = "categorical_crossentropy",
                  metrics = c("acc"))

Here, we've specified the adam optimizer. We're using categorical_crossentropy as the loss function since the labels are based on 50 authors. For the metrics, we've specified the accuracy of the author's classification...
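Fitting the compiled model can be sketched as follows; the epoch and batch-size values, and the validx/validy names for the validation partition, are assumptions:

```r
# Train with the training data, monitoring the validation partition
history <- model %>% fit(trainx, trainy,
                         epochs = 30,
                         batch_size = 32,
                         validation_data = list(validx, validy))

# Plot the loss and accuracy curves for training and validation
plot(history)
```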

Evaluating the model and predicting classes

In this section, we will evaluate the model based on our training and test data. We will obtain the accuracy with which each author is correctly classified, using a confusion matrix for the training and test data to gain further insights. We will also use bar plots to visualize the accuracy of identifying each author.

Model evaluation with training data

First, we will evaluate the model's performance using training data. Then, we will use the model to predict the class representing each of the 50 authors. The code for evaluating the model is as follows:

# Loss and accuracy
model %>% evaluate(trainx, trainy)

$loss
[1] 1.45669

$acc
[1] 0.5346288

Here, we can see that, by using the training data...
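Class prediction and the confusion matrix for the training data can be sketched as follows (predict_classes is the older keras interface for obtaining class predictions, consistent with this book's era; the trainy_int name for the integer labels is an assumption):

```r
# Predicted author classes (integers 0 to 49)
pred <- model %>% predict_classes(trainx)

# Confusion matrix of actual versus predicted authors
tab <- table(Actual = trainy_int, Predicted = pred)

# Per-author accuracy from the diagonal of the row-normalized matrix
diag(prop.table(tab, 1))
```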

Performance optimization tips and best practices

In this section, we will explore changes that we can make to the model architecture and other settings to improve author classification performance. We will carry out two experiments. In both experiments, we will increase the number of most frequent words from 500 to 1,500, increase the length of the integer sequences from 300 to 400, and add a dropout layer after the pooling layer.

Experimenting with reduced batch size

The code that we'll be using for this experiment is as follows:

# Model architecture
model <- keras_model_sequential() %>%
  layer_embedding(input_dim = 1500,
                  output_dim...
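A complete version of this modified architecture might look like the following sketch. The dropout rate, embedding size, and layer sizes are illustrative assumptions; only the 1,500-word limit, the 400-length sequences, and the dropout layer after pooling follow the experiment description above:

```r
library(keras)

model <- keras_model_sequential() %>%
  layer_embedding(input_dim = 1500,     # 1,500 most frequent words
                  output_dim = 32,      # assumed embedding size
                  input_length = 400) %>%
  layer_conv_1d(filters = 32, kernel_size = 5, activation = "relu") %>%
  layer_max_pooling_1d(pool_size = 4) %>%
  layer_dropout(rate = 0.25) %>%        # dropout added after pooling
  layer_lstm(units = 32, activation = "tanh") %>%
  layer_dense(units = 50, activation = "softmax")
```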

Summary

In this chapter, we illustrated the steps for developing a convolutional recurrent neural network for author classification based on articles that they have written. Convolutional recurrent neural networks combine the advantages of two networks into one network. On one hand, convolutional networks can capture high-level local features from the data, while, on the other hand, recurrent networks can capture long-term dependencies in the data involving sequences.

First, convolutional recurrent neural networks extract features using a one-dimensional convolutional layer. These extracted features are then passed to the LSTM recurrent layer to obtain hidden long-term dependencies, which are then passed to a fully connected dense layer. This dense layer obtains the probability of the correct classification of each author based on the data in the articles. Although we used a convolutional...
