Mastering Text Mining with R

Product type Book
Published in Dec 2016
Publisher Packt
ISBN-13 9781783551811
Pages 258 pages
Edition 1st Edition
Author (1): KUMAR ASHISH

Named entity recognition


Named entity recognition is a subprocess in the natural language processing pipeline. It identifies the names and numbers in the input document. The names can be those of a person, a company, or a location; the numbers can be money amounts or percentages, to name a few. To perform named entity recognition, we will use the Apache OpenNLP TokenNameFinderModel API. To invoke it from the R environment, we will use the openNLP R package:

  1. Load the required libraries:

    library(rJava)
    library(NLP)
    library(openNLP)
  2. Create a sample text; we will extract the entities from this text:

    txt <- " IBM is an MNC with headquarters in New York. Oracle is a cloud company in California. James works in IBM. Oracle hired John for cloud expertise. They give 100% to their profession"
  3. Convert the text to a String object for processing:

    txt_str <- as.String(txt)
  4. Process the text through the MaxEnt sentence token annotator and the MaxEnt word token annotator, both available in R packages and...
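
As a rough illustration of how the annotators named in step 4 can be chained, and not necessarily the book's own continuation, the sketch below runs sentence and word annotation first and then adds entity annotators on top. It assumes the `openNLPmodels.en` model package is installed in addition to the libraries loaded above (that package is not mentioned in the text), and it uses a shortened sample string. The variable names are illustrative only.

```r
library(NLP)
library(openNLP)

# Shortened sample text, converted to an NLP String object
txt_str <- as.String("IBM is an MNC with headquarters in New York. James works in IBM.")

# Sentence and word token annotators; entity annotation builds on their output
sent_ann <- Maxent_Sent_Token_Annotator()
word_ann <- Maxent_Word_Token_Annotator()
base_ann <- annotate(txt_str, list(sent_ann, word_ann))

# Entity annotators (assumption: the English models from openNLPmodels.en are available)
person_ann <- Maxent_Entity_Annotator(kind = "person")
org_ann    <- Maxent_Entity_Annotator(kind = "organization")
loc_ann    <- Maxent_Entity_Annotator(kind = "location")

# Add entity annotations on top of the sentence/word annotations
all_ann <- annotate(txt_str, list(person_ann, org_ann, loc_ann), base_ann)

# Keep only the entity annotations and look up the text each one covers
entities <- subset(all_ann, type == "entity")
result <- data.frame(
  text = txt_str[entities],
  kind = vapply(entities$features, function(f) f$kind, character(1)),
  stringsAsFactors = FALSE
)
print(result)
```

Other values of `kind`, such as `"money"` or `"percentage"`, select the models for numeric entities. Which spans are actually tagged depends on the pretrained models, so the result should be inspected rather than assumed.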
