Switch to the store?

Natural Language Processing: Python and NLTK

More Information
Learn
  • The scope of natural language complexity and how they are processed by machines
  • Clean and wrangle text using tokenization and chunking to help you process data better
  • Tokenize text into sentences and sentences into words
  • Classify text and perform sentiment analysis
  • Implement string matching algorithms and normalization techniques
  • Understand and implement the concepts of information retrieval and text summarization
  • Find out how to implement various NLP tasks in Python
About

Natural Language Processing is a field of computational linguistics and artificial intelligence that deals with human-computer interaction. It provides a seamless interaction between computers and human beings and gives computers the ability to understand human speech with the help of machine learning. The number of human-computer interaction instances are increasing so it’s becoming imperative that computers comprehend all major natural languages.

The first NLTK Essentials module is an introduction on how to build systems around NLP, with a focus on how to create a customized tokenizer and parser from scratch. You will learn essential concepts of NLP, be given practical insight into open source tool and libraries available in Python, shown how to analyze social media sites, and be given tools to deal with large scale text. This module also provides a workaround using some of the amazing capabilities of Python libraries such as NLTK, scikit-learn, pandas, and NumPy.

The second Python 3 Text Processing with NLTK 3 Cookbook module teaches you the essential techniques of text and language processing with simple, straightforward examples. This includes organizing text corpora, creating your own custom corpus, text classification with a focus on sentiment analysis, and distributed text processing methods.

The third Mastering Natural Language Processing with Python module will help you become an expert and assist you in creating your own NLP projects using NLTK. You will be guided through model development with machine learning tools, shown how to create training data, and given insight into the best practices for designing and building NLP-based applications using Python.

This Learning Path combines some of the best that Packt has to offer in one complete, curated package and is designed to help you quickly learn text processing with Python and NLTK. It includes content from the following Packt products:

  • NTLK essentials by Nitin Hardeniya
  • Python 3 Text Processing with NLTK 3 Cookbook by Jacob Perkins
  • Mastering Natural Language Processing with Python by Deepti Chopra, Nisheeth Joshi, and Iti Mathur
Features
  • Break text down into its component parts for spelling correction, feature extraction, and phrase transformation 
  • Work through NLP concepts with simple and easy-to-follow programming recipes 
  • Gain insights into the current and budding research topics of NLP
Page Count 702
Course Length 21 hours 3 minutes
ISBN 9781787285101
Date Of Publication 22 Nov 2016

Authors

Jacob Perkins

Jacob Perkins is the cofounder and CTO of Weotta, a local search company. Weotta uses NLP and machine learning to create powerful and easy-to-use natural language search for what to do and where to go.

He is the author of Python Text Processing with NLTK 2.0 Cookbook, Packt Publishing, and has contributed a chapter to the Bad Data Handbook, O'Reilly Media. He writes about NLTK, Python, and other technology topics at http://streamhacker.com.

To demonstrate the capabilities of NLTK and natural language processing, he developed http://text-processing.com, which provides simple demos and NLP APIs for commercial use. He has contributed to various open source projects, including NLTK, and created NLTK-Trainer to simplify the process of training NLTK models. For more information, visit https://github.com/japerk/nltk-trainer.

Nitin Hardeniya

Nitin Hardeniya is a data scientist with more than 4 years of experience working with companies such as Fidelity, Groupon, and [24]7-inc. He has worked on a variety of business problems across different domains. He holds a master's degree in computational linguistics from IIIT-H. He is the author of 5 patents in the field of customer experience.

He is passionate about language processing and large unstructured data. He has been using Python for almost 5 years in his day-to-day work. He believes that Python could be a single-point solution to most of the problems related to data science.

He has put on his hacker's hat to write this book and has tried to give you an introduction to all the sophisticated tools related to NLP and machine learning in a very simplified form. In this book, he has also provided a workaround using some of the amazing capabilities of Python libraries, such as NLTK, scikit-learn, pandas, and NumPy.

Deepti Chopra

Deepti Chopra is an Assistant Professor at Banasthali University. Her primary area of research is computational linguistics, Natural Language Processing, and artificial intelligence. She is also involved in the development of MT engines for English to Indian languages. She has several publications in various journals and conferences and also serves on the program committees of several conferences and journals.

Nisheeth Joshi

Nisheeth Joshi is an associate professor and a researcher at Banasthali University. He has also done a PhD in Natural Language Processing. He is an expert with the TDIL Program, Department of IT, Government of India, the premier organization overseeing language technology funding and research in India. He has several publications to his name in various journals and conferences, and also serves on the program committees and editorial boards of several conferences and journals.

Iti Mathur

Iti Mathur is an Assistant Professor at Banasthali University. Her areas of interest are computational semantics and ontological engineering. Besides this, she is also involved in the development of MT engines for English to Indian languages. She is one of the experts empaneled with TDIL program, Department of Electronics and Information Technology (DeitY), Govt. of India, a premier organization that oversees Language Technology Funding and Research in India. She has several publications in various journals and conferences and also serves on the program committees and editorial boards of several conferences and journals.