You're reading from  Natural Language Processing and Computational Linguistics

Product type: Book
Published in: Jun 2018
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781788838535
Edition: 1st Edition
Author (1)
Bhargav Srinivasa-Desikan

Bhargav Srinivasa-Desikan is a research engineer working for INRIA in Lille, France. He is a part of the MODAL (Models of Data Analysis and Learning) team, and he works on metric learning, predictor aggregation, and data visualization. He is a regular contributor to the Python open source community, and completed Google Summer of Code in 2016 with Gensim where he implemented Dynamic Topic Models. He is a regular speaker at PyCons and PyDatas across Europe and Asia, and conducts tutorials on text analysis using Python.

Chapter 7. Dependency Parsing

We saw in Chapter 5, POS-Tagging and Its Applications, and Chapter 6, NER-Tagging and Its Applications, how spaCy's language pipeline performs a variety of complex computational linguistics tasks, such as POS-tagging and NER-tagging. This isn't all spaCy offers, though. In this chapter, we will explore dependency parsing and how it can be used in a variety of contexts and applications. We will look at the theory of dependency parsing before moving on to using it with spaCy, as well as training our own dependency parsers. The following are the topics we will cover in this chapter:

  • Dependency parsing
  • Dependency parsing with Python
  • Training our dependency parsers
  • Summary
  • References

Dependency parsing with spaCy

If you've followed every chapter of this book up to this one, you have already dependency parsed your data multiple times: each run of your text through the pipeline annotated the words in each sentence of your document with their dependencies on the other words in that sentence. Let's set up our models again, as we did in the previous chapters.

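import spacy
# 'en' is a shortcut name from spaCy 2.x; spaCy 3.x removed shortcuts, so load a full model name such as 'en_core_web_sm' there.
nlp = spacy.load('en')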

Now that our pipeline is ready, we can begin analyzing our sentences.

The parsing portion of spaCy's pipeline does both phrasal parsing and dependency parsing. This means that we can get information about what the noun and verb chunks in a sentence are, as well as information about the dependencies between words.

Phrasal parsing can also be referred to as chunking, as we get chunks that are part of sentences...
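
To make the idea of per-word dependencies concrete, here is a minimal sketch in plain Python rather than spaCy itself. The sentence and its annotations are hand-written assumptions for illustration, not parser output, and the Token tuple and the children and subtree helpers are hypothetical names that only loosely mirror the head, dep_, children, and subtree attributes spaCy exposes on its tokens.

```python
from collections import namedtuple

# Hand-annotated toy parse (an assumption for illustration, not parser output).
# Following spaCy's convention, the root token is its own head.
Token = namedtuple("Token", ["i", "text", "head", "dep"])

WORDS = ["She", "reads", "long", "books", "."]
HEADS = [1, 1, 3, 1, 1]          # index of each word's head
DEPS = ["nsubj", "ROOT", "amod", "dobj", "punct"]

tokens = [Token(i, w, h, d) for i, (w, h, d) in enumerate(zip(WORDS, HEADS, DEPS))]


def children(tokens, head_index):
    """Return the tokens directly attached to the token at head_index."""
    return [t for t in tokens if t.head == head_index and t.i != head_index]


def subtree(tokens, head_index):
    """Return the token at head_index plus all its descendants, in sentence order."""
    found = {head_index}
    stack = [head_index]
    while stack:
        current = stack.pop()
        for child in children(tokens, current):
            if child.i not in found:
                found.add(child.i)
                stack.append(child.i)
    return [tokens[i] for i in sorted(found)]


# The root is the one token that is its own head.
root = next(t for t in tokens if t.head == t.i)

# Print every word with its dependency label and the word it attaches to.
for t in tokens:
    print(f"{t.text:<6} --{t.dep}--> {tokens[t.head].text}")

# The subtree under the direct object is a rough stand-in for a noun chunk.
object_phrase = " ".join(t.text for t in subtree(tokens, 3))
print(object_phrase)
```

Collecting the subtree under a noun is only an approximation of chunking, but it shows how phrase-level information can be read off the same head and label annotations that dependency parsing produces.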

Training our dependency parsers

Again, if you have read Chapter 4, Gensim - Vectorizing Text and Transformations and n-grams, Chapter 5, POS-Tagging and Its Applications, and Chapter 6, NER-Tagging and Its Applications, then you will be comfortable with the theory behind training our own models in spaCy. We recommend that you go back and read the Vector transformations in Gensim section from Chapter 4 and the Training our own POS-taggers section from Chapter 5 to refresh your understanding of what exactly training means in the context of machine learning and, in particular, spaCy.

Again, the advantage of spaCy is that we don't need to care about the algorithm being used under the hood, or which features are the best to select for dependency parsing; this is usually the hardest part of machine learning research. We know that an optimal learning algorithm has been selected, and all...
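
What we do have to supply is annotated data. As a hedged sketch, the snippet below assumes the annotation layout used in spaCy 2.x's parser-training examples, where each text is paired with a list of head indices and a list of dependency labels, one per token. The check_annotation helper is a hypothetical name of our own, not part of spaCy; it only runs a few sanity checks that are worth doing before handing such data to any trainer.

```python
# Assumed annotation layout, modelled on spaCy 2.x's parser-training examples:
# each item pairs a text with per-token 'heads' (absolute indices) and 'deps' (labels).
TRAIN_DATA = [
    ("They trade mortgage-backed securities.", {
        "heads": [1, 1, 4, 4, 5, 1, 1],
        "deps": ["nsubj", "ROOT", "compound", "punct", "nmod", "dobj", "punct"],
    }),
    ("I like London and Berlin.", {
        "heads": [1, 1, 1, 2, 2, 1],
        "deps": ["nsubj", "ROOT", "dobj", "cc", "conj", "punct"],
    }),
]


def check_annotation(heads, deps):
    """Return a list of problems found in one annotation (empty if none).

    Hypothetical helper: it checks list lengths, head index ranges, and that
    there is exactly one ROOT token which is its own head.
    """
    problems = []
    if len(heads) != len(deps):
        problems.append("heads and deps differ in length")
        return problems
    for i, head in enumerate(heads):
        if not 0 <= head < len(heads):
            problems.append(f"token {i}: head index {head} is out of range")
    roots = [i for i, dep in enumerate(deps) if dep == "ROOT"]
    if len(roots) != 1:
        problems.append(f"expected exactly one ROOT, found {len(roots)}")
    for i in roots:
        if heads[i] != i:
            problems.append(f"token {i}: ROOT should be its own head")
    return problems


for text, annotation in TRAIN_DATA:
    issues = check_annotation(annotation["heads"], annotation["deps"])
    print(text, "->", issues or "looks consistent")
```

The checks do not verify that the token count matches spaCy's own tokenization of the text, so they are a first filter rather than a guarantee that training will accept the data.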

Summary

This brings us to the end of our chapter on spaCy and dependency parsing. The previous four chapters have illustrated the many powers of spaCy and how we can harness them. Dependency parsing, in particular, remains very important to us, as finding semantic or syntactic relationships between words within sentences can have many uses, whether it is simply identifying the most used adjectives or adverbs for a particular word or mapping custom relationships.

In the next chapters, we will move on from computational linguistics-based algorithms to information retrieval-based algorithms to conduct our text analysis. In particular, we will cover topic models as well as clustering and classification algorithms.
