Chapter 6: Convolutional Neural Networks for Text Classification

In the previous chapter, we showed how RNNs can be used to provide sentiment classifications for text. However, RNNs are not the only neural network architecture that can be used for NLP classification tasks. Convolutional neural networks (CNNs) are another such architecture.

RNNs rely on sequential modeling: they maintain a hidden state and step through the text word by word, updating the state at each iteration. CNNs do not rely on the sequential element of language; instead, they learn from the text by perceiving each word in the sentence individually and learning its relationship to the surrounding words.
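
To make this concrete, here is a minimal sketch (not taken from the book) of a 1D convolution sliding a window over a sentence of word embeddings. The embedding size, filter count, and window size are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Illustrative dimensions (assumed, not the book's exact values)
embedding_dim = 50   # size of each word vector
sentence_len = 7     # e.g. "This is a sentence about a cat"
sentence = torch.randn(1, embedding_dim, sentence_len)  # (batch, channels, words)

# Each filter reads 3 neighboring words at a time, learning local
# word-context relationships rather than a sequential hidden state.
conv = nn.Conv1d(in_channels=embedding_dim, out_channels=100, kernel_size=3)
features = conv(sentence)            # (1, 100, 5): one value per 3-word window
pooled = features.max(dim=2).values  # max-over-time pooling: (1, 100)
print(pooled.shape)
```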

While CNNs are more commonly used for classifying images, for the reasons we will discuss, they have been shown to be effective at classifying text as well. While we do perceive text as a sequence, we also know that the meaning of individual words in the sentence depends...

Technical requirements

Exploring CNNs

The basis for CNNs comes from the field of computer vision, but they can conceptually be extended to work on NLP as well. The way the human brain processes and understands images is not on a pixel-by-pixel basis, but as a holistic map of an image and how each part of the image relates to the other parts.

A good analogy for CNNs is the difference between how the human mind processes a picture and how it processes a sentence. Consider the sentence, This is a sentence about a cat. When you read that sentence, you read the first word, followed by the second word, and so forth. Now, consider a picture of a cat. It would be foolish to assimilate the information within the picture by looking at the first pixel, followed by the second pixel. Instead, when we look at something, we perceive the whole image at once, rather than as a sequence.

For example, if we take a black and white representation of an image (in this case, the digit 1), we can see that we can transform this into a vector...
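
The figure itself is elided here, but as a rough illustration, the following hypothetical snippet builds a small black-and-white grid loosely depicting the digit 1 and flattens it into a vector; the 5x5 grid is an invented stand-in for the book's image:

```python
import torch

# A hypothetical 5x5 black-and-white grid roughly depicting the digit 1
# (1 = black pixel, 0 = white pixel); an invented stand-in for the figure.
image = torch.tensor([
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
], dtype=torch.float32)

vector = image.flatten()   # a 25-element vector representation of the image
print(vector.shape)        # torch.Size([25])
```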

Building a CNN for text classification

Now that we know the basics of CNNs, we can begin to build one from scratch. In the previous chapter, we built a model for sentiment prediction, where sentiment was a binary classifier: 1 for positive and 0 for negative. In this example, however, we will aim to build a CNN for multi-class text classification. In a multi-class problem, each example can be assigned exactly one of several classes; if an example can be assigned several classes at once, the task is instead multi-label classification. Since our model is multi-class, it will aim to predict which one of several classes an input sentence belongs to. While this problem is considerably more difficult than our binary classification task (as our sentence can now belong to one of many classes, rather than one of two), we will show that CNNs can deliver good performance on it. We will first begin by defining our data.
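
Before walking through the book's own implementation in the following sections, here is a minimal, self-contained sketch of one common CNN architecture for multi-class text classification, with filters of several window sizes and max-over-time pooling. The vocabulary size, kernel sizes, and class count are illustrative assumptions, not the book's exact values:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    """A common CNN for multi-class text classification (sketch only;
    all hyperparameters below are illustrative assumptions)."""
    def __init__(self, vocab_size=5000, embedding_dim=100,
                 n_filters=100, kernel_sizes=(2, 3, 4), n_classes=6):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        # One Conv1d per window size: each filter reads k neighboring words.
        self.convs = nn.ModuleList([
            nn.Conv1d(embedding_dim, n_filters, kernel_size=k)
            for k in kernel_sizes
        ])
        self.fc = nn.Linear(n_filters * len(kernel_sizes), n_classes)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embedding(token_ids)              # (batch, seq_len, emb)
        x = x.permute(0, 2, 1)                     # (batch, emb, seq_len)
        # Convolve, then max-pool over time for each window size.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))   # (batch, n_classes) logits

model = TextCNN()
logits = model(torch.randint(0, 5000, (8, 20)))    # a batch of 8 sentences
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 6, (8,)))
```

Because the output is one of several mutually exclusive classes, the logits feed a cross-entropy loss rather than the binary loss used in the previous chapter.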

Defining a multi-class...

Summary

In this chapter, we have shown how CNNs can be used to learn from NLP data and how to train one from scratch using PyTorch. While the methodology is very different from that used within RNNs, conceptually, CNNs capture the motivation behind n-gram language models in an algorithmic fashion, extracting implicit information about each word in a sentence from the context of its neighboring words. Now that we have mastered both RNNs and CNNs, we can begin to expand on these techniques in order to construct even more advanced models.

In the next chapter, we will learn how to build models that utilize elements of both convolutional and recurrent neural networks and use them on sequences to perform even more advanced functions, such as text translation. These are known as sequence-to-sequence networks.

