Mastering Text Mining with R

Product type Book
Published in Dec 2016
Publisher Packt
ISBN-13 9781783551811
Pages 258 pages
Edition 1st Edition
Author: KUMAR ASHISH

Entity extraction


The process of extracting information from unstructured documents is called information extraction. Most of the data produced over the internet today is semi-structured or unstructured, and much of it is written in natural language, a format meant for human readers, so natural language processing usually comes into play during information extraction. Named entity recognition (NER) is a vital subprocess in the information extraction chain; it is sometimes also called entity extraction or entity chunking. The main job of NER is to extract the rigid designators in a document and classify them into predefined categories. A named entity extractor typically works with a set of predefined categories such as the following:

  • persons

  • organizations

  • locations

  • time

  • money

  • percentages

  • dates
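To make the idea of classifying text spans into predefined categories concrete, here is a minimal sketch in Python. It is not the approach this book uses: it relies on hand-written regular expressions for just three of the categories above (money, percentages, and dates), whereas practical NER systems generally depend on trained statistical models, particularly for persons, organizations, and locations. The pattern set, the function name `extract_entities`, and the sample sentence are all illustrative assumptions.

```python
import re

# Hypothetical, hand-written patterns for three of the categories above.
# They are illustrative only and far from exhaustive.
PATTERNS = {
    "money": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?"),
    "percentage": re.compile(r"\d+(?:\.\d+)?%"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def extract_entities(text):
    """Return (category, matched text, start offset) tuples sorted by position."""
    found = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((category, match.group(), match.start()))
    return sorted(found, key=lambda item: item[2])


# Assumed sample sentence, used only to exercise the patterns.
sample = "On 2016-12-01 the firm paid $1,500.50, up 12.5% from last year."
for category, value, offset in extract_entities(sample):
    print(f"{offset:>3}  {category:<10}  {value}")
```

Each tuple pairs a span of the input with one predefined category, which is the shape of output an entity extractor produces. A rule-based approach like this can work for rigidly formatted entities but would not generalize to names of people or organizations.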

Given an unstructured document, NER will annotate the block or extract the relevant features. Consider...
