You're reading from Natural Language Processing with TensorFlow

Product type: Book
Published in: May 2018
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781788478311
Edition: 1st Edition
Authors (2):
Thushan Ganegedara

Thushan is a seasoned ML practitioner with 4+ years of experience in the industry. Currently, he is a senior machine learning engineer at Canva, an Australian startup behind the online visual design software of the same name, which serves millions of customers. His work is concentrated in the search and recommendations group, covering both visual and textual content. Prior to Canva, Thushan was a senior data scientist at QBE Insurance, an Australian insurance company, where he developed ML solutions for use cases related to insurance claims and led the development of a Speech2Text pipeline. He obtained his PhD, specializing in machine learning, from the University of Sydney in 2018.

Improving LSTMs – generating text with words instead of n-grams


Here we will discuss ways to improve LSTMs. First, we will discuss how the number of model parameters grows if we use one-hot-encoded word features. This motivates us to use low-dimensional word vectors instead of one-hot-encoded vectors. Finally, we will discuss how we can employ word vectors in the code to generate better-quality text compared to using bigrams. The code for this section is available in lstm_word2vec.ipynb in the ch8 folder.

The curse of dimensionality

One major limitation stopping us from using words instead of n-grams as the input to our LSTM is that doing so drastically increases the number of parameters in our model. Let's understand this through an example. Consider an input of size 500 and a cell state of size 100. Each of the four LSTM weight sets (the input, forget, and output gates plus the candidate value) has 500 × 100 input weights, 100 × 100 recurrent weights, and 100 biases, giving a total of approximately 240K parameters (excluding the softmax layer): 4 × (500 × 100 + 100 × 100 + 100) = 240,400.
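The count can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard LSTM parameterization (four weight sets, each with input weights, recurrent weights, and a bias vector) and no peephole connections. The function name `lstm_param_count`, the 50,000-word vocabulary, and the 128-dimensional word vectors are hypothetical values chosen for illustration, not figures from the chapter.

```python
def lstm_param_count(input_size: int, state_size: int) -> int:
    """Count LSTM parameters, excluding the softmax layer.

    Assumes a standard LSTM: three gates plus the candidate value,
    each with its own input weights, recurrent weights, and biases.
    """
    input_weights = input_size * state_size      # input -> hidden
    recurrent_weights = state_size * state_size  # hidden -> hidden
    biases = state_size
    return 4 * (input_weights + recurrent_weights + biases)


# The example from the text: input of size 500, cell state of size 100.
print(lstm_param_count(500, 100))  # 240400, i.e. roughly 240K

# Hypothetical comparison: one-hot words over a 50,000-word vocabulary
# versus 128-dimensional word vectors, with the same cell state size.
one_hot_params = lstm_param_count(50_000, 100)
word_vector_params = lstm_param_count(128, 100)
print(one_hot_params, word_vector_params)
```

Because the input size only enters through the `input_size * state_size` term, the parameter count grows linearly with the input dimensionality, which is why shrinking the input from a vocabulary-sized one-hot vector to a low-dimensional word vector reduces the model size so sharply.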

Let's now increase the size of the input to 1000. Now the total number...
