Natural Language Understanding with Python

Product type: Book
Published: Jun 2023
Publisher: Packt
ISBN-13: 9781804613429
Pages: 326
Edition: 1st Edition
Author: Deborah A. Dahl

Table of Contents

Preface
Part 1: Getting Started with Natural Language Understanding Technology
Chapter 1: Natural Language Understanding, Related Technologies, and Natural Language Applications
Chapter 2: Identifying Practical Natural Language Understanding Problems
Part 2: Developing and Testing Natural Language Understanding Systems
Chapter 3: Approaches to Natural Language Understanding – Rule-Based Systems, Machine Learning, and Deep Learning
Chapter 4: Selecting Libraries and Tools for Natural Language Understanding
Chapter 5: Natural Language Data – Finding and Preparing Data
Chapter 6: Exploring and Visualizing Data
Chapter 7: Selecting Approaches and Representing Data
Chapter 8: Rule-Based Techniques
Chapter 9: Machine Learning Part 1 – Statistical Machine Learning
Chapter 10: Machine Learning Part 2 – Neural Networks and Deep Learning Techniques
Chapter 11: Machine Learning Part 3 – Transformers and Large Language Models
Chapter 12: Applying Unsupervised Learning Approaches
Chapter 13: How Well Does It Work? – Evaluation
Part 3: Systems in Action – Applying Natural Language Understanding at Scale
Chapter 14: What to Do If the System Isn’t Working
Chapter 15: Summary and Looking to the Future
Index
Other Books You May Enjoy

Preface

Natural language understanding (NLU) is a technology that structures language so that computer systems can further process it to perform useful applications.

Developers will find that this practical guide enables them to use NLU techniques to develop many kinds of NLU applications, and managers will be able to identify areas where NLU can be applied to solve real problems in their enterprises.

Complete with step-by-step explanations of essential concepts and practical examples, this book begins by explaining what NLU is and how it can be applied. You will then learn about the wide range of current NLU techniques, including the new large language models (LLMs), and the situations in which each is best applied. In the process, you will be introduced to the most useful Python NLU libraries. Not only will you learn the basics of NLU, but you will also learn about many practical issues such as acquiring data, evaluating systems, improving your system’s results, and deploying NLU applications. Most importantly, you will not just learn a rote list of techniques, but you will learn how to take advantage of the vast number of NLU resources on the web in your future work.

Who this book is for

Python developers who are interested in learning about NLU and applying natural language processing (NLP) technology to real problems will get the most from this book, including computational linguists, linguists, data scientists, NLP developers, conversational AI developers, and students of these topics. The earlier chapters will also be interesting for non-technical project managers.

Working knowledge of Python is required to get the best from this book. You do not need any previous knowledge of NLU.

What this book covers

This book includes fifteen chapters that take you through a process that starts with understanding what NLU is and continues through selecting applications, developing systems, and figuring out how to improve a system you have developed.

Chapter 1, Natural Language Understanding, Related Technologies, and Natural Language Applications, provides an explanation of what NLU is, and how it differs from related technologies such as speech recognition.

Chapter 2, Identifying Practical Natural Language Understanding Problems, systematically goes through a wide range of potential applications of NLU and reviews the specific requirements of each type of application. It also reviews aspects of an application that might make it difficult for the current state of the art.

Chapter 3, Approaches to Natural Language Understanding – Rule-Based Systems, Machine Learning, and Deep Learning, provides an overview of the main approaches to NLU and discusses their benefits and drawbacks, including rule-based techniques, statistical techniques, and deep learning. It also discusses popular pre-trained models such as BERT and its variants. Finally, it discusses combining different approaches into a solution.

Chapter 4, Selecting Libraries and Tools for Natural Language Understanding, helps you get set up to process natural language. It begins by discussing general tools such as JupyterLab and GitHub, and how to install and use them. It then goes on to discuss installing Python and the many Python libraries that are available for NLU. Libraries that are discussed include NLTK, spaCy, and TensorFlow/Keras.

Chapter 5, Natural Language Data – Finding and Preparing Data, teaches you how to identify and prepare data for processing with NLU techniques. It discusses data from databases, the web, and other documents as well as privacy and ethics considerations. The Wizard of Oz technique and other simulated data acquisition approaches, such as generation, are covered briefly. For readers who don’t have access to their own data, or who wish to compare their results to those of other researchers, this chapter also discusses generally available and frequently used corpora. It then goes on to discuss preprocessing steps such as stemming and lemmatization.

Chapter 6, Exploring and Visualizing Data, discusses exploratory techniques for getting an overall picture of the data, such as summary statistics (word frequencies, category frequencies, and so on). It also discusses visualization tools such as matplotlib. Finally, it discusses the kinds of decisions that can be made based on visualization and statistical results.
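
As a small illustration of the kind of summary statistic mentioned for Chapter 6, the following sketch counts word frequencies over a toy corpus using only the Python standard library. The function name, the sample sentences, and the very simple regular-expression tokenizer are assumptions made for this example; they are not taken from the book's notebooks, which may well use NLTK or spaCy for tokenization.

```python
import re
from collections import Counter


def word_frequencies(texts):
    """Count lowercase word tokens across a list of strings."""
    counts = Counter()
    for text in texts:
        # Deliberately simple tokenizer (an assumption for this sketch):
        # runs of letters and apostrophes, after lowercasing.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


# Hypothetical two-document corpus.
sample_texts = [
    "The movie was great, and the acting was great too.",
    "The plot was thin.",
]

freqs = word_frequencies(sample_texts)
print(freqs.most_common(3))
```

Counts like these are often the first thing to look at in a new dataset, because they quickly show which words dominate and whether further preprocessing might be needed.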

Chapter 7, Selecting Approaches and Representing Data, discusses considerations for selecting approaches, for example, amount of data, training resources, and intended application. This chapter also discusses representing language with such techniques as vectors and embeddings in preparation for quantitative processing. It also discusses combining multiple approaches through the use of pipelines.

Chapter 8, Rule-Based Techniques, discusses how to apply rule-based techniques to specific applications. Examples include regular expressions, lemmatization, syntactic parsing, semantic role assignment, and ontologies. This chapter primarily uses the NLTK libraries.
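
To give a flavor of the simplest rule-based technique listed for Chapter 8, here is a minimal sketch of a regular expression that pulls clock times out of an utterance. The pattern, the function name, and the example sentence are all hypothetical and chosen only for illustration; a real application would need rules for many more time formats.

```python
import re

# Hypothetical rule: a clock time such as "7:30 pm" or "11 am".
TIME_PATTERN = re.compile(
    r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b",
    re.IGNORECASE,
)


def extract_times(utterance):
    """Return a list of (hour, minute, 'am' or 'pm') tuples found in the text."""
    results = []
    for match in TIME_PATTERN.finditer(utterance):
        hour = int(match.group(1))
        # The minutes group is optional, so default to 0 when it is absent.
        minute = int(match.group(2) or 0)
        results.append((hour, minute, match.group(3).lower()))
    return results


print(extract_times("Book a table at 7:30 PM, or at 11 am if that is full."))
```

Rules like this are transparent and need no training data, which is part of their appeal, but every new format has to be anticipated by hand.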

Chapter 9, Machine Learning Part 1 – Statistical Machine Learning, discusses how to apply statistical machine learning techniques such as Naïve Bayes, TF-IDF, support vector machines (SVMs), and conditional random fields to tasks such as classification, intent recognition, and entity extraction. The emphasis is on newer techniques such as SVMs and how they provide improved performance over more traditional approaches.

Chapter 10, Machine Learning Part 2 – Neural Networks and Deep Learning Techniques, covers applying machine learning techniques based on neural networks (fully connected networks, CNNs, and RNNs) to problems such as classification and information extraction. The chapter compares results using these approaches to the approaches described in the previous chapter. The chapter discusses neural net concepts such as hyperparameters, learning rate, and training iterations. This chapter uses the TensorFlow/Keras libraries.

Chapter 11, Machine Learning Part 3 – Transformers and Large Language Models, covers the currently best-performing techniques in natural language processing – transformers and pre-trained models. It discusses the insights behind transformers and includes an example of using transformers for text classification. Code for this chapter is based on the TensorFlow/Keras Python libraries.

Chapter 12, Applying Unsupervised Learning Approaches, discusses applications of unsupervised learning, such as topic modeling, including the value of unsupervised learning for exploratory applications and maximizing scarce data. It also addresses types of partial supervision such as weak supervision and distant supervision.

Chapter 13, How Well Does It Work? – Evaluation, covers quantitative evaluation. This includes segmenting the data into training, validation, and test data, evaluation with cross-validation, evaluation metrics such as precision and recall, area under the curve, ablation studies, statistical significance, and user testing.
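
Precision and recall, two of the metrics mentioned for Chapter 13, are easy to compute by hand, and doing so once can make the definitions concrete. The sketch below uses plain Python and invented labels; the function name and the data are assumptions for this example, and in practice a library such as Scikit-learn would normally be used instead.

```python
def precision_recall(gold, predicted, positive_label):
    """Compute precision and recall for one label from parallel label lists."""
    pairs = list(zip(gold, predicted))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for g, p in pairs if p == positive_label and g == positive_label)
    # False positives: predicted positive but actually something else.
    fp = sum(1 for g, p in pairs if p == positive_label and g != positive_label)
    # False negatives: actually positive but predicted something else.
    fn = sum(1 for g, p in pairs if p != positive_label and g == positive_label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented gold-standard and system labels for six test items.
gold = ["pos", "pos", "neg", "pos", "neg", "neg"]
predicted = ["pos", "neg", "neg", "pos", "pos", "neg"]

precision, recall = precision_recall(gold, predicted, "pos")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the share of the system's positive predictions that were correct, while recall is the share of the truly positive items that the system found.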

Chapter 14, What to Do If the System Isn’t Working, discusses system maintenance. If the original model isn’t adequate or if the situation in the real world changes, how does the model have to be changed? The chapter discusses adding new data and changing the structure of the application while at the same time ensuring that new data doesn’t degrade the performance of the existing system.

Chapter 15, Summary and Looking to the Future, provides an overview of the book and a look to the future. It discusses where there is potential for improvement in performance, faster training, and more challenging applications, as well as future directions for both the technology and research in this exciting field.

To get the most out of this book

The code for this book is provided in the form of Jupyter Notebooks. To run the notebooks, you should have a comfortable understanding of coding in Python and be familiar with some basic libraries. Additionally, you’ll need to install the required packages.

The easiest way to install them is by using pip, a package manager for Python. If pip is not yet installed on your system, you can find the installation instructions here: https://pypi.org/project/pip/.

Working knowledge of the Python programming language will assist with understanding the key concepts covered in this book. The examples in this book don’t require GPUs and can run on CPUs, although some of the more complex machine learning examples would run faster on a computer with a GPU.

The code for this book has only been tested on Windows 11 (64-bit).

Software/hardware used in the book (operating system requirements for every item: Windows, macOS, or Linux)

Basic platform tools: Python 3.9, Jupyter Notebooks, pip

Natural Language Processing and Machine Learning: NLTK, spaCy and displaCy, Keras, TensorFlow, Scikit-learn

Graphing and visualization: Matplotlib, Seaborn
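
Before opening the notebooks, it can be handy to check which of the libraries listed above are already installed. The following is a minimal sketch that uses the standard library's importlib.metadata module (available in Python 3.8 and later). The list of distribution names is an assumption based on how these projects are commonly named on PyPI, and the function name is made up for this example.

```python
from importlib import metadata

# Assumed PyPI distribution names for the libraries listed above.
PACKAGES = [
    "nltk",
    "spacy",
    "keras",
    "tensorflow",
    "scikit-learn",
    "matplotlib",
    "seaborn",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

Anything reported as not installed can then be installed with pip from the command line.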

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Natural-Language-Understanding-with-Python. If there’s an update to the code, it will be updated in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://packt.link/HrkNr.

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “We’ll model the adjacency matrix using the ENCOAdjacencyDistributionModule object.”

A block of code is set as follows:

preds = causal_bert.inference(
    texts=df['text'],
    confounds=df['has_photo'],
)[0]

Any command-line input or output is written as follows:

$ pip install demoji

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Select System info from the Administration panel.”

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share your thoughts

Once you’ve read Natural Language Understanding with Python, we’d love to hear your thoughts! Please visit the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere?

Is your eBook purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there. You can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

1. Scan the QR code or visit the link below:

https://packt.link/free-ebook/9781804613429

2. Submit your proof of purchase.

3. That’s it! We’ll send your free PDF and other benefits to your email directly.
