Reader small image

You're reading from  Natural Language Processing with TensorFlow - Second Edition

Product typeBook
Published inJul 2022
Reading LevelIntermediate
PublisherPackt
ISBN-139781838641351
Edition2nd Edition
Languages
Right arrow
Author (1)
Thushan Ganegedara
Thushan Ganegedara
author image
Thushan Ganegedara

Thushan is a seasoned ML practitioner with 4+ years of experience in the industry. Currently he is a senior machine learning engineer at Canva; an Australian startup that founded the online visual design software, Canva, serving millions of customers. His efforts are particularly concentrated in the search and recommendations group working on both visual and textual content. Prior to Canva, Thushan was a senior data scientist at QBE Insurance; an Australian Insurance company. Thushan was developing ML solutions for use-cases related to insurance claims. He also led efforts in developing a Speech2Text pipeline there. He obtained his PhD specializing in machine learning from the University of Sydney in 2018.
Read more about Thushan Ganegedara

Right arrow

Transformer architecture

A Transformer is a type of Seq2Seq model (discussed in the previous chapter). Transformer models can work with both image and text data. The Transformer model takes in a sequence of inputs and maps that to a sequence of outputs.

The Transformer model was initially proposed in the paper Attention is all you need by Vaswani et al. (https://arxiv.org/pdf/1706.03762.pdf). Just like a Seq2Seq model, the Transformer consists of an encoder and a decoder (Figure 10.1):

Intuition behind NMT

Figure 10.1: The encoder-decoder architecture

Let’s understand how the Transformer model works using the previously studied Machine Translation task. The encoder takes in a sequence of source language tokens and produces a sequence of interim outputs. Then the decoder takes in a sequence of target language tokens and predicts the next token for each time step (the teacher forcing technique). Both the encoder and the decoder use attention mechanisms to improve performance. For...

lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Natural Language Processing with TensorFlow - Second Edition
Published in: Jul 2022Publisher: PacktISBN-13: 9781838641351

Author (1)

author image
Thushan Ganegedara

Thushan is a seasoned ML practitioner with 4+ years of experience in the industry. Currently he is a senior machine learning engineer at Canva; an Australian startup that founded the online visual design software, Canva, serving millions of customers. His efforts are particularly concentrated in the search and recommendations group working on both visual and textual content. Prior to Canva, Thushan was a senior data scientist at QBE Insurance; an Australian Insurance company. Thushan was developing ML solutions for use-cases related to insurance claims. He also led efforts in developing a Speech2Text pipeline there. He obtained his PhD specializing in machine learning from the University of Sydney in 2018.
Read more about Thushan Ganegedara