The theory of attention within neural networks

In the previous chapter, our sequence-to-sequence model for sentence translation (with no attention implemented) used both an encoder and a decoder. The encoder produced a hidden state from the input sentence, which served as a representation of that sentence. The decoder then used this hidden state to perform the translation steps. A basic graphical illustration of this is as follows:

Figure 8.1 – Graphical representation of sequence-to-sequence models
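To make this concrete, the following is a minimal sketch (not the book's exact code) of such an encoder-decoder pair in PyTorch, where the encoder's final hidden state is the only information handed to the decoder. The class names, the choice of a GRU, and the embedding/hidden sizes are illustrative assumptions.

import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, input_vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(input_vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)

    def forward(self, src):
        embedded = self.embedding(src)           # (batch, src_len, hidden)
        outputs, hidden = self.gru(embedded)     # hidden: (1, batch, hidden)
        return hidden                            # single vector representing the sentence

class Decoder(nn.Module):
    def __init__(self, output_vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(output_vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, output_vocab_size)

    def forward(self, prev_token, hidden):
        embedded = self.embedding(prev_token)    # (batch, 1, hidden)
        output, hidden = self.gru(embedded, hidden)
        logits = self.out(output.squeeze(1))     # scores for the next target token
        return logits, hidden

Because the decoder receives only this single hidden state, all of the information about the source sentence must be compressed into it, which motivates the discussion that follows.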

However, decoding from the entire hidden state is not necessarily the most efficient way of performing this task. The hidden state represents the whole input sentence, yet for some tasks (such as predicting the next word in a sentence) we do not need to consider the whole input sentence, only the parts that are relevant to the prediction we are trying to make. We can show that by using attention within our sequence...
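As an illustrative sketch of this idea (again, not the book's code), dot-product attention lets the decoder score every encoder output against its current hidden state and build a context vector weighted towards the most relevant input positions. The function name and tensor shapes below are assumptions for the example.

import torch
import torch.nn.functional as F

def attention(decoder_hidden, encoder_outputs):
    # decoder_hidden:  (batch, hidden)
    # encoder_outputs: (batch, src_len, hidden)
    scores = torch.bmm(encoder_outputs, decoder_hidden.unsqueeze(2)).squeeze(2)  # (batch, src_len)
    weights = F.softmax(scores, dim=1)                                           # attention weights over input positions
    context = torch.bmm(weights.unsqueeze(1), encoder_outputs).squeeze(1)        # (batch, hidden) weighted summary
    return context, weights

# Example usage with random tensors
enc_out = torch.randn(2, 5, 16)   # batch of 2, source length 5, hidden size 16
dec_hid = torch.randn(2, 16)
context, weights = attention(dec_hid, enc_out)
print(weights.sum(dim=1))         # each row of attention weights sums to 1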
