NLTK Essentials

Build cool NLP and machine learning applications using NLTK and other Python libraries

Nitin Hardeniya

Book Details

ISBN-13: 9781784396909
Paperback: 194 pages

Book Description

Natural Language Processing (NLP) is the field of artificial intelligence and computational linguistics that deals with the interactions between computers and human languages. As human-computer interaction grows, it is becoming imperative for computers to comprehend all major natural languages. The Natural Language Toolkit (NLTK) is a powerful and robust tool for this purpose.

You start with an introduction that gives you the gist of how to build systems around NLP. You then explore data science-related tasks, after which you learn how to create a customized tokenizer and parser from scratch. Throughout, the book covers the essential concepts of NLP and offers practical insight into the open source tools and libraries available in Python for NLP. You will then learn how to analyze social media sites to discover trending topics and perform sentiment analysis. Finally, you will see tools that help you deal with large-scale text.

By the end of this book, you will be confident about NLP and data science concepts and know how to apply them in your day-to-day work.

Table of Contents

Chapter 1: Introduction to Natural Language Processing
Why learn NLP?
Let's start playing with Python!
Diving into NLTK
Your turn
Summary
Chapter 2: Text Wrangling and Cleansing
What is text wrangling?
Text cleansing
Sentence splitter
Tokenization
Stemming
Lemmatization
Stop word removal
Rare word removal
Spell correction
Your turn
Summary
Chapter 3: Part of Speech Tagging
What is Part of speech tagging
Named Entity Recognition (NER)
Your Turn
Summary
Chapter 4: Parsing Structure in Text
Shallow versus deep parsing
The two approaches in parsing
Why we need parsing
Different types of parsers
Dependency parsing
Chunking
Information extraction
Summary
Chapter 5: NLP Applications
Building your first NLP application
Other NLP applications
Summary
Chapter 6: Text Classification
Machine learning
Text classification
Sampling
The Random forest algorithm
Text clustering
Topic modeling in text
References
Summary
Chapter 7: Web Crawling
Web crawlers
Writing your first crawler
Data flow in Scrapy
The Sitemap spider
The item pipeline
External references
Summary
Chapter 8: Using NLTK with Other Python Libraries
NumPy
SciPy
pandas
matplotlib
External references
Summary
Chapter 9: Social Media Mining in Python
Data collection
Data extraction
Geovisualization
Summary
Chapter 10: Text Mining at Scale
Different ways of using Python on Hadoop
NLTK on Hadoop
Scikit-learn on Hadoop
PySpark
Summary

What You Will Learn

  • Get a glimpse of the complexity of natural languages and how they are processed by machines
  • Clean and wrangle text using tokenization and chunking to help you better process data
  • Explore the different types of tags available and learn how to tag sentences
  • Create a customized parser and tokenizer to suit your needs
  • Build a real-life application with features such as spell correction, search, machine translation and a question answering system
  • Retrieve any data content using crawling and scraping
  • Perform feature extraction and selection, and build a classification system on different pieces of texts
  • Use various other Python libraries such as pandas, scikit-learn, matplotlib, and gensim
  • Analyze social media sites to discover trending topics and perform sentiment analysis
