You're reading from  Hands-On Natural Language Processing with PyTorch 1.x

Product type: Book
Published in: Jul 2020
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781789802740
Edition: 1st Edition
Author: Thomas Dop

Thomas Dop is a data scientist at MagicLab, a company that creates leading dating apps, including Bumble and Badoo. He works on a variety of areas within data science, including NLP, deep learning, computer vision, and predictive modeling. He holds an MSc in data science from the University of Amsterdam.

TF-IDF

TF-IDF is another technique for representing natural language. It is often used in text mining and information retrieval to match documents against search terms, but it can also be combined with embeddings to produce better sentence representations. Let's take the following phrase:

This is a small giraffe

Let's say we want a single embedding to represent the meaning of this sentence. One thing we could do is simply average the individual embeddings of each of the five words in this sentence:

Figure 3.28 – Word embeddings
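
The averaging step described above can be sketched in a few lines of plain Python. This is only an illustration: the three-dimensional vectors in toy_embeddings are made-up values, not real pretrained embeddings, and the helper name average_embedding is hypothetical.

```python
# Toy 3-dimensional embeddings. The values are invented for illustration
# only; real embeddings (for example, GloVe) have many more dimensions.
toy_embeddings = {
    "this":    [0.1, 0.3, 0.5],
    "is":      [0.2, 0.1, 0.4],
    "a":       [0.0, 0.2, 0.1],
    "small":   [0.6, 0.4, 0.2],
    "giraffe": [0.9, 0.8, 0.3],
}


def average_embedding(sentence, embeddings):
    """Return the element-wise mean of the embeddings of each word."""
    tokens = sentence.lower().split()
    vectors = [embeddings[token] for token in tokens]
    dim = len(vectors[0])
    # Every word contributes equally: sum each dimension, divide by word count.
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


sentence_vector = average_embedding("This is a small giraffe", toy_embeddings)
print(sentence_vector)
```

Each of the five words contributes exactly one fifth of the resulting vector, regardless of how informative the word is.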

However, this approach assigns equal weight to every word in the sentence. Do you think that all the words contribute equally to the meaning of the sentence? "This" and "a" are very common words in the English language, whereas "giraffe" is seen very rarely. Therefore, we might want to assign more weight to the rarer words. This methodology is known as Term Frequency –...
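
The idea of giving rarer words more weight can be sketched as follows. This is a minimal sketch under stated assumptions, not necessarily the formulation the book goes on to use: it assumes the common definition of TF-IDF (term frequency within the sentence multiplied by log(N / document frequency)), and the corpus, the embedding values, and the function names are all hypothetical.

```python
import math

# Made-up embeddings and a made-up four-document corpus, for illustration only.
toy_embeddings = {
    "this":    [0.1, 0.3, 0.5],
    "is":      [0.2, 0.1, 0.4],
    "a":       [0.0, 0.2, 0.1],
    "small":   [0.6, 0.4, 0.2],
    "giraffe": [0.9, 0.8, 0.3],
}
toy_corpus = [
    "this is a small giraffe",
    "this is a big house",
    "a dog is in the garden",
    "this is the end",
]


def inverse_document_frequency(corpus):
    """Assumed IDF definition: log(number of documents / documents containing the word)."""
    n_docs = len(corpus)
    doc_freq = {}
    for doc in corpus:
        for word in set(doc.split()):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


def tfidf_weighted_embedding(sentence, embeddings, idf):
    """Average word embeddings, weighting each word by its TF-IDF score."""
    tokens = sentence.lower().split()
    # Term frequency is the word's share of the sentence.
    # Words missing from the corpus would raise a KeyError in this sketch.
    weights = {t: (tokens.count(t) / len(tokens)) * idf[t] for t in set(tokens)}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all words have zero weight")
    dim = len(next(iter(embeddings.values())))
    vector = [
        sum(weights[t] * embeddings[t][i] for t in weights) / total
        for i in range(dim)
    ]
    return vector, weights


idf = inverse_document_frequency(toy_corpus)
sentence_vector, weights = tfidf_weighted_embedding(
    "This is a small giraffe", toy_embeddings, idf
)
print(weights)
print(sentence_vector)
```

In this toy corpus, "is" appears in every document and so receives a weight of zero, while "giraffe" and "small" appear only once and dominate the sentence vector.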
