
You're reading from Python Data Science Essentials - Third Edition

Product type: Book
Published in: Sep 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789537864
Edition: 3rd Edition
Author (1)
Alberto Boschetti

Alberto Boschetti is a data scientist with expertise in signal processing and statistics. He holds a Ph.D. in telecommunication engineering and currently lives and works in London. In his work projects, he faces challenges ranging from natural language processing (NLP) and behavioral analysis to machine learning and distributed processing. He is very passionate about his job and always tries to stay updated about the latest developments in data science technologies, attending meet-ups, conferences, and other events.

A peek into natural language processing (NLP)

This section is not strictly about machine learning, but it presents some machine learning results from the field of natural language processing. Python has many packages for processing text data, and one of the most powerful and complete toolkits is NLTK, the Natural Language Toolkit.

Other NLP toolkits available to the Python community are gensim (https://radimrehurek.com/gensim/) and spaCy (https://spacy.io/).

In the following sections, we'll explore NLTK's core functionality. We will work with English; for other languages, you will first need to download the corresponding language corpora (note that some languages have no free, open source corpora for NLTK).
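As a first taste of what NLTK offers, here is a minimal sketch that tokenizes and stems a sentence. The sample sentence and the choice of `WordPunctTokenizer` and `PorterStemmer` are illustrative assumptions rather than the examples used later in the chapter; they are picked here because both are rule-based and should work without downloading any corpus.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import WordPunctTokenizer

# Illustrative sample sentence (an assumption, not taken from the book).
text = "NLTK is a toolkit for text processing."

# Rule-based tokenizer: splits the text into runs of word characters
# and runs of punctuation, so no trained model or corpus is required.
tokens = WordPunctTokenizer().tokenize(text)

# Porter stemmer: strips common English suffixes using fixed rules.
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

print(tokens)
print(stems)

# Corpus-backed tools need their data fetched first, for example with
# nltk.download("<resource id>"), which requires network access.
```

Tools that depend on corpora or trained models, unlike the two used here, will typically raise a `LookupError` until the matching resource has been downloaded.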

Please refer to the official NLTK data website, http://www.nltk.org/nltk_data/, for access to corpora and lexical resources in many languages, ready...
