The Rise of Methods for Text Generation

In the preceding chapters, we discussed different methods and techniques to develop and train generative models. In particular, in Chapter 6, Image Generation with GANs, we discussed the taxonomy of generative models and introduced the explicit and implicit classes. Throughout this book, our focus has been on developing generative models in the vision space, using image and video datasets. The rapid advances in deep learning for computer vision, along with the relative ease of understanding visual examples, were the main reasons for that focus.

Over the past couple of years, though, Natural Language Processing (NLP), or the processing of textual data, has attracted a great deal of interest and research. Text is not just another type of unstructured data; there is much more to it than meets the eye. Textual data is a representation of our thoughts, ideas, knowledge, and communication.

In this chapter and the next, we will focus on understanding concepts...

Representing text

Language is one of the most complex aspects of our existence. We use language to communicate our thoughts and choices. Every language is defined by a list of characters called its alphabet, a vocabulary, and a set of rules called grammar. Yet understanding and learning a language is not a trivial task: languages are complex, with fuzzy grammatical rules and structures.

Text is a representation of language that helps us communicate and share. This makes it a perfect area of research to expand the horizons of what artificial intelligence can achieve. Text is unstructured data that cannot be used directly by most algorithms: machine learning and deep learning algorithms generally work with numbers, matrices, vectors, and so on. This, in turn, raises the question: how can we represent text for different language-related tasks?

Bag of Words

As we mentioned earlier, every language consists of a defined list of characters (alphabet...
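To make the counting idea concrete, here is a minimal bag-of-words sketch using scikit-learn's CountVectorizer; the toy corpus and the choice of scikit-learn are illustrative assumptions rather than the chapter's own code.

```python
# Minimal bag-of-words sketch: each sentence becomes a vector of raw word
# counts over a shared vocabulary learned from the corpus.
from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
]

vectorizer = CountVectorizer()
bow_matrix = vectorizer.fit_transform(corpus)

print(vectorizer.get_feature_names_out())  # learned vocabulary
print(bow_matrix.toarray())                # one row of word counts per sentence
```

Each sentence is turned into a fixed-length vector of word counts over the shared vocabulary, which is exactly the kind of numeric representation our models can consume.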

Text generation and the magic of LSTMs

In the previous sections, we discussed different ways of representing textual data in order to make it fit for consumption by different NLP algorithms. In this section, we will leverage this understanding of text representation to work our way toward building text generation models.

So far, we have built models using feedforward networks consisting of different kinds and combinations of layers. These networks process one training example at a time, each independent of the other training samples; we say that the samples are independent and identically distributed, or IID. Language, or text, is a bit different.

As we discussed in the previous sections, words change their meaning based on the context they are being used in. In other words, if we were to develop and train a language generation model, we would have to ensure the model understands the context of its input.

Recurrent Neural Networks (RNNs) are a class of neural networks...
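As a rough sketch of where this section is heading, a minimal character-level LSTM language model in Keras might look like the following; the vocabulary size, context length, and layer sizes are assumptions for illustration, not the chapter's exact model.

```python
# Illustrative character-level LSTM language model (assumed hyperparameters).
import tensorflow as tf

vocab_size = 80    # number of distinct characters in the corpus (assumed)
seq_length = 100   # length of the character context window (assumed)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(seq_length,)),
    tf.keras.layers.Embedding(vocab_size, 64),   # map character IDs to dense vectors
    tf.keras.layers.LSTM(256),                   # carry context across the sequence
    tf.keras.layers.Dense(vocab_size, activation="softmax"),  # next-character probabilities
])

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```

The model reads a window of character IDs and outputs a probability distribution over the next character; repeatedly sampling from that distribution is what produces generated text.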

LSTM variants and convolutions for text

RNNs are extremely useful when it comes to handling sequential datasets. We saw in the previous section how a simple model learned to generate plausible text from the patterns in its training dataset.

Over the years, there have been a number of enhancements in the way we model and use RNNs. In this section, we will discuss two widely used variants of the single-layer LSTM network we discussed in the previous section: stacked and bidirectional LSTMs.

Stacked LSTMs

We are well aware of how the depth of a neural network helps it learn complex and abstract concepts when it comes to computer vision tasks. Along the same lines, a stacked LSTM architecture, which has multiple layers of LSTMs stacked one after the other, has been shown to give considerable improvements. Stacked LSTMs were first presented by Graves et al. in their work Speech Recognition with Deep Recurrent Neural Networks.6 They highlight the fact that depth...
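A minimal sketch of such a stacked architecture in Keras is shown below; the two-layer depth and the hyperparameters are illustrative assumptions, not the architecture from the paper or the chapter. The key detail is that every LSTM layer except the last sets return_sequences=True, so the next layer receives the full sequence of hidden states rather than only the final one.

```python
# Illustrative two-layer (stacked) LSTM language model (assumed hyperparameters).
import tensorflow as tf

vocab_size, seq_length = 80, 100  # assumed values

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(seq_length,)),
    tf.keras.layers.Embedding(vocab_size, 64),
    tf.keras.layers.LSTM(256, return_sequences=True),  # lower layer passes its full sequence upward
    tf.keras.layers.LSTM(256),                          # upper layer consumes that sequence
    tf.keras.layers.Dense(vocab_size, activation="softmax"),
])

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```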

Summary

Congratulations on completing a complex chapter involving a large number of concepts. In this chapter, we covered various concepts associated with handling textual data for the task of text generation. We started off by developing an understanding of different text representation models. We covered most of the widely used representation models, from Bag of Words to word2vec and even FastText.

The next section of the chapter focused on developing an understanding of RNN-based text generation models. We briefly discussed what comprises a language model and how we can prepare a dataset for such a task. We then trained a character-based language model to generate synthetic text samples. We touched upon different decoding strategies and used them to understand the different outputs from our RNN-based language model. We also delved into a few variants, such as stacked LSTMs and bidirectional LSTM-based language models. Finally, we discussed the usage of convolutional networks in...

References

  1. Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. arXiv. https://arxiv.org/abs/1301.3781
  2. Rumelhart, D.E., & McClelland, J.L. (1987). Distributed Representations, in Parallel Distributed Processing: Explorations in the Microstructure of Cognition: Foundations, pp.77-109. MIT Press. https://web.stanford.edu/~jlmcc/papers/PDP/Chapter3.pdf
  3. Pennington, J., Socher, R., & Manning, C.D. (2014). GloVe: Global Vectors for Word Representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). https://nlp.stanford.edu/pubs/glove.pdf
  4. Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching Word Vectors with Subword Information. arXiv. https://arxiv.org/abs/1607.04606
  5. Holtzman, A., Buys, J., Du, L., Forbes, M., & Choi, Y. (2019). The Curious Case of Neural Text Degeneration. arXiv. https://arxiv.org/abs/1904.09751