You're reading from Natural Language Processing with Python Quick Start Guide

Product type: Book
Published in: Nov 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789130386
Edition: 1st Edition
Author (1)
Nirant Kasliwal

Nirant Kasliwal maintains an awesome list of natural language processing (NLP) resources, which GitHub's machine learning collection features as the go-to guide. Nobel Laureate Dr. Paul Romer found his programming notes on Jupyter Notebooks helpful. Nirant won the first ever NLP Google Kaggle Kernel Award. At Soroco, he works on image segmentation and intent categorization. His state-of-the-art language modeling results are available as Hindi2vec.

Preface

Natural language processing (NLP) is the use of machines to manipulate natural language. This book teaches you how to build NLP applications in Python, using code and relevant case studies. It introduces the basic vocabulary and a suggested workflow for building NLP applications, to help you get started with popular NLP tasks such as sentiment analysis, entity recognition, part-of-speech tagging, stemming, and word embeddings.

Who this book is for

This book is for programmers who wish to build systems that can interpret language and who have exposure to Python programming. Familiarity with basic NLP vocabulary and with machine learning would be helpful but is not mandatory.

What this book covers

Chapter 1, Getting Started with Text Classification, introduces the reader to NLP and what a good NLP workflow looks like. You will also learn how to prepare text for machine learning with scikit-learn.

Chapter 2, Tidying Your Text, discusses some of the most common text pre-processing ideas. You will be introduced to spaCy and will learn how to use it for tokenization, sentence extraction, and lemmatization.

Chapter 3, Leveraging Linguistics, goes into a simple use case and examines how we can solve it. We then repeat the task on a slightly different text corpus.

Chapter 4, Text Representations – Words to Numbers, introduces readers to the Gensim API. We will also learn to load pre-trained GloVe vectors and to use these vector representations instead of TF-IDF in any machine learning model.

Chapter 5, Modern Methods for Classification, looks at several new ideas regarding machine learning. The intention here is to demonstrate some of the most common classifiers. We will also learn about sentiment analysis, simple classifiers and how to optimize them for your datasets, and ensemble methods.

Chapter 6, Deep Learning for NLP, covers what deep learning is, how it differs from what we have seen, and the key ideas in any deep learning model. We will also look at a few topics regarding PyTorch, how to tokenize text, and what recurrent networks are.

Chapter 7, Building Your Own Chatbot, explains why chatbots should be built and how to figure out the correct user intent. We will also learn in detail about intent, response, templates, and entities.

Chapter 8, Web Deployments, explains how to train a model and write some neater utils for data I/O. We are going to build a predict function and expose it using a Flask REST endpoint.
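
To give a flavor of the "words to numbers" idea that runs through Chapters 1 and 4, here is a minimal sketch of TF-IDF weighting in plain Python. It is not code from the book: the function name tfidf and the two sample sentences are hypothetical, and it assumes the textbook formula (term frequency times log(N / document frequency)) rather than the smoothed, normalized variant that scikit-learn applies by default.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency (share of the document) times inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample data, for illustration only.
docs = ["the movie was great", "the movie was terrible"]
vectors = tfidf(docs)

for doc, vector in zip(docs, vectors):
    print(doc, vector)
```

In this sketch, a word that appears in every document (such as "the") gets a weight of zero, while words that distinguish one document from another (such as "great") get a positive weight.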

To get the most out of this book

  • You will need conda with Python 3.6 or newer
  • A basic understanding of the Python programming language will be required
  • NLP or machine learning experience will be helpful but is not mandatory

Download the example code files

You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.

You can download the code files by following these steps:

  1. Log in or register at www.packt.com.
  2. Select the SUPPORT tab.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box and follow the onscreen instructions.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR/7-Zip for Windows
  • Zipeg/iZip/UnRarX for Mac
  • 7-Zip/PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Natural-Language-Processing-with-Python-Quick-Start-Guide. In case there's an update to the code, it will be updated on the existing GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "I used the sed syntax."

A block of code is set as follows:

url = 'http://www.gutenberg.org/ebooks/1661.txt.utf-8'
file_name = 'sherlock.txt'

Any command-line input or output is written as follows:

import pandas as pd
import numpy as np

Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "The Prediction: pos is actually a result from the file I uploaded to this page earlier."

Warnings or important notes appear like this.
Tips and tricks appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at customercare@packtpub.com.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in, and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Reviews

Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!

For more information about Packt, please visit packt.com.
