Machine Learning Part 2 – Neural Networks and Deep Learning Techniques

Neural networks (NNs) only became popular in natural language understanding (NLU) around 2010 but have since been widely applied to many problems. In addition, NNs have many applications to problems outside natural language processing (NLP), such as image classification. The fact that NNs are a general approach that can be applied across different research areas has led to some interesting synergies across these fields.

In this chapter, we will cover the application of machine learning (ML) techniques based on NNs to problems such as NLP classification. We will also cover several different kinds of commonly used NNs—specifically, fully connected multilayer perceptrons (MLPs), convolutional NNs (CNNs), and recurrent NNs (RNNs)—and show how they can be applied to problems such as classification and information extraction. We will also discuss fundamental NN concepts such as hyperparameters...

Basics of NNs

The basic concepts behind NNs have been studied for many years but have only fairly recently been applied to NLP problems on a large scale. Currently, NNs are one of the most popular tools for solving NLP tasks. NNs are a large and very actively researched field, so we won't be able to give you a comprehensive understanding of NNs for NLP. However, we will attempt to provide you with some basic knowledge that will let you apply NNs to your own problems.

NNs are inspired by some properties of the animal nervous system. Specifically, animal nervous systems consist of a network of interconnected cells, called neurons, that transmit information throughout the network with the result that, given an input, the network produces an output that represents a decision about the input.

Artificial NNs (ANNs) are designed to model this process in some respects. The decision about how to react to the inputs is determined by a sequence of processing steps starting with...
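To make this concrete, here is a minimal sketch (not from the book) of a single artificial neuron in Python: it computes a weighted sum of its inputs plus a bias and passes the result through an activation function. The sigmoid activation and the example numbers are illustrative choices.

```python
# A minimal sketch (not from the book) of a single artificial neuron:
# a weighted sum of the inputs plus a bias, passed through an
# activation function. The sigmoid and the example numbers are
# illustrative choices.
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs, then a nonlinear activation
    return sigmoid(np.dot(weights, inputs) + bias)

# Three inputs, three corresponding weights, and a bias
print(neuron(np.array([0.5, -1.2, 3.0]), np.array([0.4, 0.1, -0.6]), bias=0.2))
```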

Example – MLP for classification

We will review basic NN concepts by looking at the MLP, which is conceptually one of the most straightforward types of NNs. The example we will use is the classification of movie reviews into reviews with positive and negative sentiments. Since there are only two possible categories, this is a binary classification problem. We will use the Sentiment Labelled Sentences Data Set (From Group to Individual Labels using Deep Features, Kotzias et al., KDD 2015, https://archive.ics.uci.edu/ml/datasets/Sentiment+Labelled+Sentences), available from the University of California, Irvine. Start by downloading the data and unzipping it into a directory in the same directory as your Python script. You will see a directory called sentiment labelled sentences that contains the actual data in a file called imdb_labelled.txt. You can install the data into another directory of your choosing, but if you do, be sure to modify the filepath_dict variable accordingly.
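As a rough illustration of what loading this data might look like, the following hedged sketch reads the IMDb file with pandas. The filepath_dict variable mirrors the one the text mentions, but the exact structure of the book's own listing may differ.

```python
# A hedged sketch of loading the data (the book's own listing may
# differ). filepath_dict mirrors the variable the text mentions; the
# path assumes the data was unzipped next to this script.
import pandas as pd

filepath_dict = {
    "imdb": "sentiment labelled sentences/imdb_labelled.txt",
}

frames = []
for source, filepath in filepath_dict.items():
    # Each line in the file is "sentence<TAB>label", where the label is
    # 0 (negative sentiment) or 1 (positive sentiment)
    df = pd.read_csv(filepath, names=["sentence", "label"], sep="\t")
    df["source"] = source  # record which corpus the sentence came from
    frames.append(df)

df = pd.concat(frames)
print(df.head())
```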

...

Hyperparameters and tuning

Figure 10.4 clearly shows that increasing the number of training epochs is not going to improve performance on this task. The best validation accuracy seems to be about 80% after 10 epochs. However, 80% accuracy is not very good. How can we improve it? Here are some ideas. None of them is guaranteed to work, but each is worth experimenting with:

  • If more training data is available, increase the amount of training data.
  • Investigate preprocessing techniques that can remove noise from the training data, for example, stopword removal, removing non-words such as numbers and HTML tags, stemming and lemmatization, and lowercasing. Details on these techniques were covered in Chapter 5.
  • Change the learning rate; for example, lowering the learning rate might improve the ability of the network to avoid local minima (see the sketch after this list).
  • Decrease the batch size (also shown in the sketch after this list).
  • Change the number of layers and the number of neurons in each layer...
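To show where two of these hyperparameters live in code, here is a hedged Keras sketch (not the book's listing) of a small MLP for the binary sentiment task, with a lowered learning rate and a decreased batch size. The random arrays stand in for the vectorized reviews, and all sizes are illustrative assumptions.

```python
# A hedged sketch (not the book's listing) of a small Keras MLP for the
# binary sentiment task, showing where the learning rate and batch size
# are set. The random arrays stand in for the vectorized reviews, and
# all sizes here are illustrative assumptions.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

rng = np.random.default_rng(0)
X_train = rng.random((748, 2500)).astype("float32")  # stand-in BoW vectors
y_train = rng.integers(0, 2, size=748)               # stand-in 0/1 labels

model = keras.Sequential([
    layers.Input(shape=(2500,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # single binary output
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),  # lowered from the 1e-3 default
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

history = model.fit(
    X_train, y_train,
    epochs=10,
    batch_size=16,         # decreased from the common default of 32
    validation_split=0.2,
)
```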

Moving beyond MLPs – RNNs

RNNs are a type of NN that can take into account the order of items in an input. In the MLP example discussed previously, the vector representing the entire input (that is, the complete document) was fed to the NN at once, so the network had no way of taking into account the order of words in the document. However, this is clearly an oversimplification in the case of text data, since the order of words can be very important to the meaning. RNNs take the order of words into account by feeding earlier outputs back in as inputs at later time steps. This can be especially helpful in NLP problems where the order of words is very important, such as named entity recognition (NER), part-of-speech (POS) tagging, or slot labeling.

A diagram of a unit of an RNN is shown in Figure 10.5:

Figure 10.5 – A unit of an RNN

The unit is shown at time t. The input at time t, x(t), is passed to the activation...
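To make the recurrence concrete, here is a hedged NumPy sketch (not the book's code) of a single step of a simple Elman-style RNN unit, following the standard recurrence h(t) = tanh(W_x x(t) + W_h h(t-1) + b). The dimensions and the random weights are illustrative assumptions.

```python
# A hedged NumPy sketch (not the book's code) of one step of a simple
# Elman-style RNN unit, following the standard recurrence
#     h(t) = tanh(W_x @ x(t) + W_h @ h(t-1) + b)
# The dimensions and the random weights are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3

W_x = rng.standard_normal((hidden_dim, input_dim))   # input-to-hidden weights
W_h = rng.standard_normal((hidden_dim, hidden_dim))  # recurrent hidden-to-hidden weights
b = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    # The new hidden state depends on both the current input and the
    # previous hidden state, which is how word order is captured
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

h = np.zeros(hidden_dim)
for x_t in rng.standard_normal((5, input_dim)):  # a 5-step input sequence
    h = rnn_step(x_t, h)
print(h)
```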

Looking at another approach – CNNs

CNNs are very popular for image recognition tasks, but they are less often used than RNNs for NLP tasks because they don't take into account the temporal order of items in the input. However, they can be useful for document classification tasks. As you will recall from earlier chapters, the representations that are often used in classification, such as bag of words (BoW) and TF-IDF, depend only on the words that occur in the document, so effective classification can often be accomplished without taking word order into account.

To classify documents with CNNs, we can represent a text as an array of vectors, where each word is mapped to a vector in a space derived from the full vocabulary. We can use word2vec, which we discussed in Chapter 7, to represent word vectors. Training a CNN for text classification with Keras is very similar to the training process that we worked through for MLP classification. We create a sequential model as...
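As a hedged sketch of what such a model might look like (the book's own listing may differ), the following Keras sequential model convolves over windows of word vectors. A trainable Embedding layer stands in here for the pretrained word2vec vectors mentioned above, and the vocabulary size, embedding size, and sequence length are illustrative.

```python
# A hedged sketch (the book's own listing may differ) of a sequential
# Keras model for text classification with a 1D convolution over word
# vectors. A trainable Embedding layer stands in for the pretrained
# word2vec vectors mentioned in the text; all sizes are illustrative.
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, embedding_dim, max_len = 5000, 100, 100

model = keras.Sequential([
    layers.Input(shape=(max_len,)),               # a document as a sequence of word indices
    layers.Embedding(vocab_size, embedding_dim),  # map each word index to a vector
    layers.Conv1D(128, 5, activation="relu"),     # convolve over 5-word windows
    layers.GlobalMaxPooling1D(),                  # keep the strongest response per filter
    layers.Dense(10, activation="relu"),
    layers.Dense(1, activation="sigmoid"),        # binary classification output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```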

Summary

In this chapter, we have explored applications of NNs to document classification in NLP. We covered the basic concepts of NNs, reviewed a simple MLP, and applied it to a binary classification problem. We also provided some suggestions for improving performance by tuning hyperparameters. Finally, we discussed two more advanced types of NNs: RNNs and CNNs.

In Chapter 11, we will cover the currently best-performing techniques in NLP—transformers and pretrained models.
