You're reading from Recurrent Neural Networks with Python Quick Start Guide

Product type: Book
Published in: Nov 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789132335
Edition: 1st

Author: Simeon Kostadinov

Simeon Kostadinov works for a startup called Speechify, which aims to help people get through their reading faster by converting any text into speech. Simeon is a machine learning enthusiast who writes a blog and works on various side projects. He enjoys reading research papers and implementing some of them in code. He was ranked number one in mathematics during his senior year of high school, and he has a deep passion for understanding how deep learning models work under the hood. His specific knowledge of recurrent neural networks comes from several courses he has taken at Stanford University and the University of Birmingham, which helped him apply his theoretical knowledge in practice and build powerful models. In addition, he recently joined the Stanford Scholar Initiative, which involves working with a team of machine learning researchers on a specific deep learning research paper.

Creating a Spanish-to-English Translator

This chapter will push your neural network knowledge even further by introducing state-of-the-art concepts at the core of today's most powerful language translation systems. You will build a simple version of a Spanish-to-English translator, which accepts a sentence in Spanish and outputs its English equivalent.

This chapter includes the following sections:

  • Understanding the translation model: This section is entirely focused on the theory behind this system.
  • What an LSTM network is: We'll look at what sits behind this advanced version of recurrent neural networks.
  • Understanding the sequence-to-sequence network with attention: You will grasp the theory behind this powerful model, learn what it actually does, and see why it is so widely used across different problems.
  • Building the Spanish-to-English translator: This section...

Understanding the translation model

Machine translation is often done using so-called statistical machine translation, based on statistical models. This approach works very well, but a key issue is that, for every pair of languages, we need to rebuild the architecture. Thankfully, in 2014, Cho et al. (https://arxiv.org/pdf/1406.1078.pdf) published a paper that aims to solve this and other problems using the increasingly popular recurrent neural networks. The model is called sequence-to-sequence, and it can be trained on any pair of languages simply by providing the right amount of data. In addition, its power lies in its ability to match sequences of different lengths, such as in machine translation, where a sentence in English may differ in length from its Spanish counterpart. Let's examine how these tasks...

What is an LSTM network?

An LSTM (long short-term memory) network is an advanced RNN that aims to solve the vanishing gradient problem and yields excellent results on longer sequences. In the previous chapter, we introduced the GRU network, which is a simpler version of the LSTM. Both include memory states that determine what information should be propagated further at each timestep. The LSTM cell looks as follows:

Let's introduce the main equations that will clarify the preceding diagram. They are similar to the ones for gated recurrent units (see Chapter 3, Generating Your Own Book Chapter). Here is what happens at every given timestep, t:

o_t is the output gate, which determines what exactly is important for the current prediction and what information should be kept around for the future. i_t is called the input gate, and determines...
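To make the gate equations concrete, here is a minimal NumPy sketch of a single LSTM timestep. The weights are random and untrained, and the names and shapes are illustrative rather than the book's own code; the point is only to show how the forget, input, and output gates combine the previous state with the current input:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM timestep: gates decide what to forget, write, and output."""
    Wf, Wi, Wo, Wc, bf, bi, bo, bc = params
    z = np.concatenate([h_prev, x_t])      # previous hidden state + current input
    f_t = sigmoid(Wf @ z + bf)             # forget gate: what to drop from memory
    i_t = sigmoid(Wi @ z + bi)             # input gate: what new info to write
    o_t = sigmoid(Wo @ z + bo)             # output gate: what to expose this step
    c_tilde = np.tanh(Wc @ z + bc)         # candidate memory content
    c_t = f_t * c_prev + i_t * c_tilde     # new cell (memory) state
    h_t = o_t * np.tanh(c_t)               # new hidden state
    return h_t, c_t

# Tiny usage example with random weights
rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
params = [rng.standard_normal((n_hid, n_hid + n_in)) for _ in range(4)] + \
         [np.zeros(n_hid) for _ in range(4)]
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.standard_normal(n_in), h, c, params)
print(h.shape)  # (3,)
```

Because the output gate is a sigmoid and the cell state passes through tanh, every component of the hidden state stays strictly between -1 and 1.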

Understanding the sequence-to-sequence network with attention

Now that you understand how the LSTM network works, let's take a step back and look at the full network architecture. As we said before, we are using a sequence-to-sequence model with an attention mechanism. This model consists of LSTM units grouped together, forming the encoder and decoder parts of the network.

In a simple sequence-to-sequence model, we input a sentence of a given length and create a vector that captures all the information in that particular sentence. After that, we use the vector to predict the translation. You can read more about how this works in a wonderful Google paper (https://arxiv.org/pdf/1409.3215.pdf), listed in the External links section at the end of this chapter.
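The encode-to-one-vector-then-decode idea can be sketched in a few lines. The following toy example uses simple tanh RNN cells standing in for LSTMs, with random untrained weights and made-up token ids, so it illustrates only the data flow, not a real translator:

```python
import numpy as np

rng = np.random.default_rng(1)
vocab_src, vocab_tgt, n_hid = 10, 8, 6

# Random parameters -- illustrative only; a real model learns these
E_src = rng.standard_normal((vocab_src, n_hid))       # source embeddings
E_tgt = rng.standard_normal((vocab_tgt, n_hid))       # target embeddings
W_enc = rng.standard_normal((n_hid, 2 * n_hid)) * 0.1
W_dec = rng.standard_normal((n_hid, 2 * n_hid)) * 0.1
W_out = rng.standard_normal((vocab_tgt, n_hid)) * 0.1

def encode(src_ids):
    """Run the encoder over the source sentence; the final hidden state
    is the fixed-size context vector summarizing the whole sentence."""
    h = np.zeros(n_hid)
    for t in src_ids:
        h = np.tanh(W_enc @ np.concatenate([h, E_src[t]]))
    return h

def decode(context, max_len=5, bos=0):
    """Greedy decoding: start from the context vector and emit one
    target token per step, feeding each prediction back in."""
    h, tok, out = context, bos, []
    for _ in range(max_len):
        h = np.tanh(W_dec @ np.concatenate([h, E_tgt[tok]]))
        tok = int(np.argmax(W_out @ h))   # most likely next token
        out.append(tok)
    return out

context = encode([3, 1, 4])    # a toy 'Spanish' sentence as token ids
translation = decode(context)  # toy 'English' token ids
print(len(translation))        # 5
```

Notice that the decoder sees the source sentence only through the single context vector; this bottleneck is exactly what the attention mechanism, discussed next, is designed to relieve.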

That approach is fine but, as always, we can do better. In that case...

Building the Spanish-to-English translator

I hope the previous sections left you with a good understanding of the model we are about to build. Now, we are going to get practical and write the code behind our translation system. We should end up with a trained network capable of predicting the English version of any sentence in Spanish. Let's dive into programming.

Preparing the data

The first step, as always, is to collect the needed data and prepare it for training. The more complicated our systems become, the more work it takes to massage the data into the right shape. We are going to use Spanish-to-English phrases from the OpenSubtitles free data source (http://opus.nlpl.eu/OpenSubtitles.php). We will...
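The core of this preparation is pairing up the Spanish and English sentences, tokenizing them, and mapping tokens to integer ids. Here is a minimal sketch of those steps with a two-sentence toy corpus standing in for the OpenSubtitles files (the special tokens and vocabulary scheme are illustrative assumptions, not the book's exact pipeline):

```python
from collections import Counter

def build_vocab(sentences, max_size=10000):
    """Map the most frequent tokens to ids; reserve 0/1 for padding and unknowns."""
    counts = Counter(tok for s in sentences for tok in s)
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok, _ in counts.most_common(max_size - len(vocab)):
        vocab[tok] = len(vocab)
    return vocab

def encode(sentence, vocab):
    """Replace each token with its id, falling back to <unk> for unseen words."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in sentence]

# Toy parallel corpus in place of the downloaded OpenSubtitles files
spanish = [s.lower().split() for s in ["hola mundo", "buenos dias"]]
english = [s.lower().split() for s in ["hello world", "good morning"]]

es_vocab = build_vocab(spanish)
en_vocab = build_vocab(english)
pairs = [(encode(s, es_vocab), encode(t, en_vocab))
         for s, t in zip(spanish, english)]
print(pairs[0])  # ([2, 3], [2, 3])
```

Each training example is then a pair of integer sequences, which is exactly the shape the sequence-to-sequence model consumes.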

Summary

This chapter walked you through building a fairly sophisticated sequence-to-sequence translation model implemented with the TensorFlow library.

First, you went through the theoretical part, gaining an understanding of how the model works under the hood and why its application has led to remarkable achievements. In addition, you learned how an LSTM network works and why it is widely considered one of the most effective RNN variants.

Second, you saw how to put the knowledge acquired here into practice using just a few lines of code. In addition, you gained an understanding of how to prepare your data to fit the sequence-to-sequence model. Finally, you were able to successfully translate Spanish sentences into English.

I really hope this chapter left you more confident in your deep learning knowledge and armed you with new skills that you can apply to future...
