Generative AI with Python and TensorFlow 2


Product type Book
Published in Apr 2021
Publisher Packt
ISBN-13 9781800200883
Pages 488 pages
Edition 1st Edition
Authors (2):
Joseph Babcock
Raghav Bali

Table of Contents (16) Chapters

Preface
1. An Introduction to Generative AI: "Drawing" Data from Models
2. Setting Up a TensorFlow Lab
3. Building Blocks of Deep Neural Networks
4. Teaching Networks to Generate Digits
5. Painting Pictures with Neural Networks Using VAEs
6. Image Generation with GANs
7. Style Transfer with GANs
8. Deepfakes with GANs
9. The Rise of Methods for Text Generation
10. NLP 2.0: Using Transformers to Generate Text
11. Composing Music with Generative Models
12. Play Video Games with Generative AI: GAIL
13. Emerging Applications in Generative AI
14. Other Books You May Enjoy
15. Index

Attention

The LSTM-based architecture we used to prepare our first language model for text generation had one major limitation. The RNN layer (whether an LSTM, a GRU, or another variant) takes a context window of a fixed size as input and encodes all of it into a single vector. This bottleneck vector has to capture all the relevant information from the window before the decoding stage can use it to generate the next token.

Attention is one of the most influential concepts in deep learning. The core idea behind the attention mechanism is to make use of all the intermediate hidden states of the RNN, rather than only the final one, and to decide which of them to focus on before the result is passed to the decoding stage. A more formal way of presenting attention is:

Given a set of value vectors (all the hidden states of the RNN) and a query vector (for example, the decoder state), attention is a technique for computing a weighted sum of the values, with weights that depend on the query.
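To make the definition concrete, here is a minimal sketch in plain Python. It is not the book's implementation. The definition above does not say how a query is compared with each value, so the sketch assumes dot-product scoring followed by a softmax, which is one common choice. The function names (`softmax`, `dot_product_attention`) and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, values):
    """Return (context, weights) for one query over a list of value vectors.

    Assumption: the relevance of each value is scored by its dot product
    with the query; other scoring functions are possible.
    """
    # One score per value vector (that is, per RNN hidden state).
    scores = [sum(q * v for q, v in zip(query, value)) for value in values]
    weights = softmax(scores)
    # Weighted sum of the value vectors, computed dimension by dimension.
    dim = len(values[0])
    context = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(dim)
    ]
    return context, weights


# Toy data: three hidden states of size 2 and a decoder state as the query.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]

context, weights = dot_product_attention(decoder_state, hidden_states)
print("attention weights:", weights)
print("context vector:", context)
```

In this toy run the first and third hidden states have the largest dot product with the query, so they receive equal weight and more of it than the second state. The context vector has the same size as a single hidden state, whatever the number of hidden states.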

The weighted sum acts...
