You're reading from Natural Language Processing and Computational Linguistics

Product type Book

Published in Jun 2018

Publisher Packt

ISBN-13 9781788838535

Pages 306 pages

Edition 1st Edition

Languages

Python

Concepts

Mobile Application Development

Author (1):

Bhargav Srinivasa-Desikan

Table of Contents (22) Chapters

Title Page

Packt Upsell

Contributors

Preface

What is Text Analysis?

Python Tips for Text Analysis

spaCy's Language Models

Gensim – Vectorizing Text and Transformations and n-grams

POS-Tagging and Its Applications

NER-Tagging and Its Applications

Dependency Parsing

Topic Models

Advanced Topic Modeling

Clustering and Classifying Text

Similarity Queries and Summarization

Word2Vec, Doc2Vec, and Gensim

Deep Learning for Text

Keras and spaCy for Deep Learning

Sentiment Analysis and ChatBots

Other Books You May Enjoy

Leave a review - let other readers know what you think

Index

Chapter 14. Keras and spaCy for Deep Learning

In the previous chapter, we introduced deep learning techniques for text and, to get a taste of using neural networks, attempted to generate text using an RNN. In this chapter, we will take a closer look at deep learning for text: in particular, how to set up a Keras model that can perform classification, and how to incorporate deep learning into spaCy pipelines.

Here are a few useful links:

Keras Sequential model [1]
Keras CNN LSTM [2]
Pre-trained word embeddings [3]

Keras and spaCy

In the previous chapter, we discussed various deep learning frameworks. In this chapter, we will look at one of them, Keras, in a little more detail, while also exploring how we can use deep learning with spaCy.

During our attempts at text generation, we already used Keras, but did not explain the motivation for using the library, or how and why we constructed our model the way we did. We will attempt to demystify this, and also set up a neural network model that will aid us in text classification.

In our brief review of the various deep learning frameworks available in Python, we described Keras as a high-level library that allows us to easily construct neural networks.
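To make "high-level" concrete, here is a minimal sketch of how a small text classifier can be declared with the Keras Sequential API. It is only an illustration, not the model built later in the chapter: the vocabulary size, embedding size, and layer widths are assumed values, and we import Keras through TensorFlow, which may differ from your installation.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical sizes, chosen only for illustration.
VOCAB_SIZE = 10000   # number of distinct word indices the model accepts
EMBEDDING_DIM = 32   # length of each learned word vector

# A Sequential model is a plain stack of layers applied one after another.
model = keras.Sequential([
    layers.Embedding(input_dim=VOCAB_SIZE, output_dim=EMBEDDING_DIM),  # word index -> vector
    layers.GlobalAveragePooling1D(),        # average the word vectors of each document
    layers.Dense(16, activation="relu"),    # small hidden layer
    layers.Dense(1, activation="sigmoid"),  # one probability per document
])

# compile() attaches the loss, optimizer, and metrics used during training.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

The whole architecture is a list of layers plus one `compile()` call; swapping the pooling layer for a recurrent or convolutional layer changes the model without touching the rest of the code.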

Fig 14.1 The arXiv mentions of Keras. arXiv is a website where researchers upload research papers before they are accepted by a journal. Here, the x-axis shows the different Python deep learning libraries, and the y-axis shows the number of references to each library in the papers...

Classification with Keras

For our experiments, we will use the IMDB sentiment classification task. This is quite a small dataset; we use it because it is readily available through Keras and convenient to load. It is important to understand that, for a dataset of this size, a deep neural network is not the best choice for classification: we might even get better results with a simple bag-of-words representation followed by a Support Vector Machine (SVM). The purpose of the following examples is rather to show how to construct a neural network using Keras and how to make predictions with it. Fine-tuning the network and studying its hyperparameters is a different ball game altogether and is not the focus of this chapter. Another thing to remember when working with text data and neural networks is that in almost all cases, more data is better and that neural networks are far better...
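The workflow described above (load the data, build a network, train it, and make predictions) can be sketched roughly as follows. This is a hedged outline rather than the chapter's exact code: the constants `MAX_FEATURES` and `MAX_LEN`, the layer sizes, and the helper names `load_imdb`, `build_model`, and `train_and_evaluate` are our own assumptions, and Keras is imported through TensorFlow.

```python
from tensorflow import keras
from tensorflow.keras import layers

MAX_FEATURES = 20000  # keep only the most frequent words (assumed value)
MAX_LEN = 80          # pad or truncate every review to this many tokens (assumed value)


def load_imdb():
    """Download the IMDB reviews as lists of word indices and pad them."""
    (x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(
        num_words=MAX_FEATURES
    )
    x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=MAX_LEN)
    x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=MAX_LEN)
    return (x_train, y_train), (x_test, y_test)


def build_model():
    """Embedding -> LSTM -> sigmoid output for binary sentiment."""
    model = keras.Sequential([
        layers.Embedding(MAX_FEATURES, 128),
        layers.LSTM(64),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model


def train_and_evaluate(epochs=2, batch_size=32):
    """Train on the IMDB training split and report accuracy on the test split."""
    (x_train, y_train), (x_test, y_test) = load_imdb()
    model = build_model()
    model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
              validation_data=(x_test, y_test))
    loss, accuracy = model.evaluate(x_test, y_test, batch_size=batch_size)
    # predict() returns one value between 0 and 1 per review.
    probabilities = model.predict(x_test[:5])
    return accuracy, probabilities
```

Calling `train_and_evaluate()` downloads the dataset on first use and then trains the network, so it is kept inside a function here instead of running on import. As noted above, the accuracy you get from a network like this on a dataset this small should not be read as evidence that deep learning is the right tool for the job.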

Classification with spaCy

While Keras works especially well for standalone text classification tasks, it can sometimes be useful to use Keras in tandem with spaCy, which works exceedingly well for text analysis. In Chapter 3, spaCy's Language Models, Chapter 5, POS-Tagging and Its Applications, Chapter 6, NER-Tagging and Its Applications, and Chapter 7, Dependency Parsing, we already saw how well spaCy works with textual data, and deep learning is no exception: its text-oriented approach makes it easy to build a classifier that works well with text. There are two ways to perform text classification with spaCy: one uses its own neural network library, thinc, while the other uses Keras. Both examples we will explain are from spaCy's documentation, and it is highly recommended that you check out the original examples!
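As a rough illustration of the first route, spaCy's own text categorizer (which is backed by thinc), here is a minimal sketch. It assumes spaCy v3, whose training API differs from the version the original documentation examples were written for, and it uses a tiny made-up dataset, so the scores it produces are not meaningful. The name `train_textcat` and the labels are our own choices.

```python
import random

import spacy
from spacy.training import Example

# Hypothetical toy data; real experiments need far more examples.
TRAIN_DATA = [
    ("I loved this film, it was wonderful", {"cats": {"POSITIVE": 1.0, "NEGATIVE": 0.0}}),
    ("A brilliant and moving story", {"cats": {"POSITIVE": 1.0, "NEGATIVE": 0.0}}),
    ("I hated this film, it was terrible", {"cats": {"POSITIVE": 0.0, "NEGATIVE": 1.0}}),
    ("A dull and boring story", {"cats": {"POSITIVE": 0.0, "NEGATIVE": 1.0}}),
]


def train_textcat(train_data, n_iter=10, seed=0):
    """Train a blank English pipeline containing only a text categorizer."""
    random.seed(seed)
    nlp = spacy.blank("en")
    textcat = nlp.add_pipe("textcat")  # spaCy's built-in, thinc-backed classifier
    textcat.add_label("POSITIVE")
    textcat.add_label("NEGATIVE")

    # Each Example pairs an unannotated Doc with its gold-standard categories.
    examples = [
        Example.from_dict(nlp.make_doc(text), annotations)
        for text, annotations in train_data
    ]
    optimizer = nlp.initialize(lambda: examples)

    for _ in range(n_iter):
        random.shuffle(examples)
        losses = {}
        nlp.update(examples, sgd=optimizer, losses=losses)
    return nlp


nlp = train_textcat(TRAIN_DATA)
doc = nlp("What a wonderful film")
print(doc.cats)  # a dict mapping each label to a score
```

Because the classifier is just another pipeline component, the predicted scores end up on `doc.cats` alongside whatever other annotations the pipeline produces.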

The first example we will be exploring can be found on the spaCy example page, and is titled deep_learning_keras.py [20]. In the example, we use an...

Summary

In the previous chapter, we introduced deep learning for text, and in this chapter, we saw how we can leverage the power of deep learning in our own applications, whether we use Keras or spaCy. Knowing how to assign sentiment scores or classify our documents gives us a huge boost when designing intelligent text systems, and with pretrained models, we don't have to perform heavy computations every time we wish to make such a classification. It is now within our capacity to build a strong and varied text analysis pipeline!

In the next chapter, we will discuss two popular text analysis problems, sentiment analysis and building our own chatbot, and the possible approaches we can take to solve them.

Authors (1)

Bhargav Srinivasa-Desikan

Bhargav Srinivasa-Desikan is a research engineer working for INRIA in Lille, France. He is a part of the MODAL (Models of Data Analysis and Learning) team, and he works on metric learning, predictor aggregation, and data visualization. He is a regular contributor to the Python open source community, and completed Google Summer of Code in 2016 with Gensim where he implemented Dynamic Topic Models. He is a regular speaker at PyCons and PyDatas across Europe and Asia, and conducts tutorials on text analysis using Python.