
You're reading from Mastering Data Mining with Python - Find patterns hidden in your data

Product type: Book
Published in: Aug 2016
Reading level: Intermediate
ISBN-13: 9781785889950
Edition: 1st Edition
Author: Megan Squire

Megan Squire is a professor of computing sciences at Elon University. Her primary research interest is in collecting, cleaning, and analyzing data about how free and open source software is made. She is one of the leaders of the FLOSSmole.org, FLOSSdata.org, and FLOSSpapers.org projects.

Summary


In this chapter, we learned what Named Entity Recognition (NER) is and how it works in practice. We reviewed the characteristics of a named entity and compared many strategies for finding named entities in text and classifying them into their correct types. We implemented a simple NER program using NLTK and used it to detect named entities in four different types of technical communication: chat, chat summaries, e-mails, and meeting minutes. We measured the accuracy of our NER program on each of these text samples using precision, recall, and the F1-measure, and learned how the characteristics of a text sample affect the accuracy of the program.
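As a reminder of how the three accuracy measures relate, here is a minimal sketch in plain Python. It is not the chapter's own code: the function name `score_entities` and the sample entity sets are hypothetical, and it assumes entities are compared as exact string matches, which is only one of several possible matching rules.

```python
def score_entities(predicted, gold):
    """Return (precision, recall, f1) for two collections of entity strings.

    Assumption: an entity counts as correct only on an exact string match.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)

    # Precision: what fraction of the entities we reported are correct?
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: what fraction of the real entities did we find?
    recall = true_positives / len(gold) if gold else 0.0
    # F1: harmonic mean of precision and recall (0 when both are 0).
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical hand-labeled entities versus hypothetical program output.
gold = {"Megan", "Python", "NLTK", "Apache"}
predicted = {"Megan", "Python", "Monday"}

precision, recall, f1 = score_entities(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this made-up sample, two of the three reported entities are correct and two of the four real entities are found, so precision is higher than recall; the F1-measure falls between the two.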

One outcome of this chapter was to demonstrate that text written in plain language with fewer technical terms is easier to mine for named entities than highly technical text with a lot of code snippets, function names, acronyms, and the like. We noticed that we got the best results...
