You're reading from Recurrent Neural Networks with Python Quick Start Guide

Product type: Book
Published in: Nov 2018
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781789132335
Edition: 1st

Author: Simeon Kostadinov

Simeon Kostadinov works for a startup called Speechify, which aims to help people get through their reading faster by converting any text into speech. Simeon is a machine learning enthusiast who writes a blog and works on various side projects. He enjoys reading research papers and implementing some of them in code. He was ranked number one in mathematics during his senior year of high school, and he has a deep passion for understanding how deep learning models work under the hood. His specific knowledge of recurrent neural networks comes from several courses he has taken at Stanford University and the University of Birmingham, which helped him apply his theoretical knowledge in practice and build powerful models. In addition, he recently joined the Stanford Scholar Initiative, which involves working with a team of machine learning researchers on a specific deep learning research paper.

Creating a Spanish-to-English Translator

This chapter will push your neural network knowledge even further by introducing state-of-the-art concepts at the core of today's most powerful language translation systems. You will build a simple version of a Spanish-to-English translator, which accepts a sentence in Spanish and outputs its English equivalent.

This chapter includes the following sections:

  • Understanding the translation model: This section is entirely focused on the theory behind this system.
  • What an LSTM network is: We'll look at what sits behind this advanced version of recurrent neural networks.
  • Understanding the sequence-to-sequence network with attention: You will grasp the theory behind this powerful model, learn what it actually does, and see why it is so widely used across different problems.
  • Building the Spanish-to-English translator: This section...

Understanding the translation model

Machine translation is often done using so-called statistical machine translation, based on statistical models. This approach works very well, but a key issue is that, for every pair of languages, we need to rebuild the architecture. Thankfully, in 2014, Cho et al. (https://arxiv.org/pdf/1406.1078.pdf) published a paper that aims to solve this and other problems using the increasingly popular recurrent neural networks. The model is called sequence-to-sequence, and it can be trained on any pair of languages simply by providing the right amount of data. In addition, its power lies in its ability to match sequences of different lengths, such as in machine translation, where a sentence in English may differ in length from its Spanish counterpart. Let's examine how these tasks...

What is an LSTM network?

An LSTM (long short-term memory) network is an advanced RNN that aims to solve the vanishing gradient problem and yields excellent results on longer sequences. In the previous chapter, we introduced the GRU network, which is a simpler version of the LSTM. Both include memory states that determine what information should be propagated further at each timestep. The LSTM cell looks as follows:

Let's introduce the main equations that will clarify the preceding diagram. They are similar to the ones for gated recurrent units (see Chapter 3, Generating Your Own Book Chapter). Here is what happens at every given timestep, t:

o_t is the output gate, which determines what exactly is important for the current prediction and what information should be kept around for the future. i_t is called the input gate, and determines...
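To make the gate equations concrete, here is a minimal NumPy sketch of a single LSTM timestep. The weights are random and untrained, and the names and shapes are illustrative rather than the book's own code; the point is only to show how the forget, input, and output gates combine the previous state with the current input:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM timestep: gates decide what to forget, write, and output."""
    Wf, Wi, Wo, Wc, bf, bi, bo, bc = params
    z = np.concatenate([h_prev, x_t])      # previous hidden state + current input
    f_t = sigmoid(Wf @ z + bf)             # forget gate: what to drop from memory
    i_t = sigmoid(Wi @ z + bi)             # input gate: what new info to write
    o_t = sigmoid(Wo @ z + bo)             # output gate: what to expose this step
    c_tilde = np.tanh(Wc @ z + bc)         # candidate memory content
    c_t = f_t * c_prev + i_t * c_tilde     # new cell (memory) state
    h_t = o_t * np.tanh(c_t)               # new hidden state
    return h_t, c_t

# Tiny usage example with random weights
rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
params = [rng.standard_normal((n_hid, n_hid + n_in)) for _ in range(4)] + \
         [np.zeros(n_hid) for _ in range(4)]
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.standard_normal(n_in), h, c, params)
print(h.shape)  # (3,)
```

Because the output gate is a sigmoid and the cell state passes through tanh, every component of the hidden state stays strictly between -1 and 1.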

Understanding the sequence-to-sequence network with attention

Now that you understand how the LSTM network works, let's take a step back and look at the full network architecture. As we said before, we are using a sequence-to-sequence model with an attention mechanism. This model consists of LSTM units grouped together, forming the encoder and decoder parts of the network.

In a simple sequence-to-sequence model, we input a sentence of a given length and create a vector that captures all the information in that particular sentence. After that, we use the vector to predict the translation. You can read more about how this works in a wonderful Google paper (https://arxiv.org/pdf/1409.3215.pdf), listed in the External links section at the end of this chapter.
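The encode-to-one-vector-then-decode idea can be sketched in a few lines. The following toy example uses simple tanh RNN cells standing in for LSTMs, with random untrained weights and made-up token ids, so it illustrates only the data flow, not a real translator:

```python
import numpy as np

rng = np.random.default_rng(1)
vocab_src, vocab_tgt, n_hid = 10, 8, 6

# Random parameters -- illustrative only; a real model learns these
E_src = rng.standard_normal((vocab_src, n_hid))       # source embeddings
E_tgt = rng.standard_normal((vocab_tgt, n_hid))       # target embeddings
W_enc = rng.standard_normal((n_hid, 2 * n_hid)) * 0.1
W_dec = rng.standard_normal((n_hid, 2 * n_hid)) * 0.1
W_out = rng.standard_normal((vocab_tgt, n_hid)) * 0.1

def encode(src_ids):
    """Run the encoder over the source sentence; the final hidden state
    is the fixed-size context vector summarizing the whole sentence."""
    h = np.zeros(n_hid)
    for t in src_ids:
        h = np.tanh(W_enc @ np.concatenate([h, E_src[t]]))
    return h

def decode(context, max_len=5, bos=0):
    """Greedy decoding: start from the context vector and emit one
    target token per step, feeding each prediction back in."""
    h, tok, out = context, bos, []
    for _ in range(max_len):
        h = np.tanh(W_dec @ np.concatenate([h, E_tgt[tok]]))
        tok = int(np.argmax(W_out @ h))   # most likely next token
        out.append(tok)
    return out

context = encode([3, 1, 4])    # a toy 'Spanish' sentence as token ids
translation = decode(context)  # toy 'English' token ids
print(len(translation))        # 5
```

Notice that the decoder sees the source sentence only through the single context vector; this bottleneck is exactly what the attention mechanism, discussed next, is designed to relieve.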

That approach is fine but, as always, we can do better. In that case...

Building the Spanish-to-English translator

I hope the previous sections left you with a good understanding of the model we are about to build. Now, we are going to get practical and write the code behind our translation system. We should end up with a trained network capable of predicting the English version of any sentence in Spanish. Let's dive into programming.

Preparing the data

The first step, as always, is to collect the needed data and prepare it for training. The more complicated our systems become, the more work it takes to massage the data into the right shape. We are going to use Spanish-to-English phrases from the OpenSubtitles free data source (http://opus.nlpl.eu/OpenSubtitles.php). We will...
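The core of this preparation is pairing up the Spanish and English sentences, tokenizing them, and mapping tokens to integer ids. Here is a minimal sketch of those steps with a two-sentence toy corpus standing in for the OpenSubtitles files (the special tokens and vocabulary scheme are illustrative assumptions, not the book's exact pipeline):

```python
from collections import Counter

def build_vocab(sentences, max_size=10000):
    """Map the most frequent tokens to ids; reserve 0/1 for padding and unknowns."""
    counts = Counter(tok for s in sentences for tok in s)
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok, _ in counts.most_common(max_size - len(vocab)):
        vocab[tok] = len(vocab)
    return vocab

def encode(sentence, vocab):
    """Replace each token with its id, falling back to <unk> for unseen words."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in sentence]

# Toy parallel corpus in place of the downloaded OpenSubtitles files
spanish = [s.lower().split() for s in ["hola mundo", "buenos dias"]]
english = [s.lower().split() for s in ["hello world", "good morning"]]

es_vocab = build_vocab(spanish)
en_vocab = build_vocab(english)
pairs = [(encode(s, es_vocab), encode(t, en_vocab))
         for s, t in zip(spanish, english)]
print(pairs[0])  # ([2, 3], [2, 3])
```

Each training example is then a pair of integer sequences, which is exactly the shape the sequence-to-sequence model consumes.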

Summary

This chapter walked you through building a fairly sophisticated sequence-to-sequence translation model implemented with the TensorFlow library.

First, you went through the theoretical part, gaining an understanding of how the model works under the hood and why its application has led to remarkable achievements. In addition, you learned how an LSTM network works and why it is widely considered one of the most effective RNN variants.

Second, you saw how to put the knowledge acquired here into practice using just a few lines of code. In addition, you gained an understanding of how to prepare your data to fit the sequence-to-sequence model. Finally, you were able to successfully translate Spanish sentences into English.

I really hope this chapter left you more confident in your deep learning knowledge and armed you with new skills that you can apply to future...
