Python 3 Text Processing with NLTK 3 Cookbook

Product type: Book
Published: Aug 2014
ISBN-13: 9781782167853
Pages: 304
Edition: 1st Edition
Author: Jacob Perkins

Affix tagging


The AffixTagger class is another ContextTagger subclass, but this time the context is either the prefix or the suffix of a word. This means the AffixTagger class is able to learn tags based on fixed-length substrings at the beginning or end of a word.
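
As a rough illustration of what "fixed-length substring" means here, the following sketch extracts a prefix or suffix from a word. The helper name affix_of is hypothetical and is not part of NLTK; it only follows the convention this recipe uses, where a positive length selects a prefix and a negative length selects a suffix.

```python
def affix_of(word, affix_length):
    """Return a fixed-length prefix (positive length) or suffix (negative length).

    Hypothetical helper for illustration only; it is not an NLTK function.
    """
    if affix_length > 0:
        return word[:affix_length]   # first N characters
    return word[affix_length:]       # last N characters


print(affix_of("tagging", -3))  # three-character suffix
print(affix_of("tagging", 3))   # three-character prefix
```

For the word "tagging", the three-character suffix is "ing" and the three-character prefix is "tag". An affix tagger uses a substring like this, rather than the whole word, as the context it associates with a tag.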

How to do it...

The default arguments for the AffixTagger class specify three-character suffixes and require words to be at least five characters long. If a word is shorter than five characters, then None is returned as the tag.
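
To make the learning step concrete, here is a minimal pure-Python sketch of the idea under simplifying assumptions. It is not NLTK's implementation: the function names, the toy training data, and the "most frequent tag per affix" rule are all assumptions made for illustration. Words shorter than the minimum length get None, as do words whose affix was never seen in training.

```python
from collections import Counter, defaultdict


def get_affix(word, affix_length=-3, min_word_length=5):
    """Return the affix used as context, or None if the word is too short."""
    if len(word) < min_word_length:
        return None
    return word[:affix_length] if affix_length > 0 else word[affix_length:]


def train_affix_table(tagged_sents, affix_length=-3, min_word_length=5):
    """Map each affix to the tag it was seen with most often (illustrative only)."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            affix = get_affix(word, affix_length, min_word_length)
            if affix is not None:
                counts[affix][tag] += 1
    return {affix: tags.most_common(1)[0][0] for affix, tags in counts.items()}


def tag_word(table, word, affix_length=-3, min_word_length=5):
    """Look up the tag for a word's affix; None if too short or unseen."""
    affix = get_affix(word, affix_length, min_word_length)
    return table.get(affix)


# Toy training data, invented for this sketch.
toy_sents = [
    [("walking", "VBG"), ("quickly", "RB")],
    [("running", "VBG"), ("slowly", "RB"), ("the", "DT")],
]
table = train_affix_table(toy_sents)

print(tag_word(table, "jumping"))  # suffix "ing" was seen in training
print(tag_word(table, "dog"))      # shorter than five characters
```

In this sketch, "jumping" is tagged VBG because its suffix "ing" was learned from "walking" and "running", while "dog" yields None because it is too short to be considered. With NLTK itself, the tagger with default arguments is trained and evaluated as follows.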

>>> from nltk.tag import AffixTagger
>>> tagger = AffixTagger(train_sents)
>>> tagger.evaluate(test_sents)
0.27558817181092166

So it does OK by itself with the default arguments. Let's try three-character prefixes instead, which means passing a positive affix_length.

>>> prefix_tagger = AffixTagger(train_sents, affix_length=3)
>>> prefix_tagger.evaluate(test_sents)
0.23587308439456076

To learn on two-character suffixes, the code will look like this:

>>> suffix_tagger = AffixTagger(train_sents...