You're reading from  Hands-On Natural Language Processing with PyTorch 1.x

Product type: Book
Published in: Jul 2020
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781789802740
Edition: 1st
Author: Thomas Dop

Thomas Dop is a data scientist at MagicLab, a company that creates leading dating apps, including Bumble and Badoo. He works on a variety of areas within data science, including NLP, deep learning, computer vision, and predictive modeling. He holds an MSc in data science from the University of Amsterdam.

Chapter 7: Text Translation Using Sequence-to-Sequence Neural Networks

In the previous two chapters, we used neural networks to classify text and perform sentiment analysis. Both tasks involve taking an NLP input and predicting a value. In the case of our sentiment analysis model, this was a number between 0 and 1 representing the sentiment of the sentence. In the case of our sentence classification model, the output was a multi-class prediction, assigning the sentence to one of several possible categories. But what if we wish to make not just a single prediction, but to predict a whole sentence? In this chapter, we will build a sequence-to-sequence model that takes a sentence in one language as input and outputs its translation in another language.

We have already explored several types of neural network architectures used for NLP, namely recurrent neural networks in Chapter 5, Recurrent Neural Networks and Sentiment Analysis, and convolutional neural networks...

Technical requirements

Theory of sequence-to-sequence models

Sequence-to-sequence models are very similar to the conventional neural network structures we have seen so far. The main difference is that for a model's output, we expect another sequence, rather than a binary or multi-class prediction. This is particularly useful in tasks such as translation, where we may wish to convert a whole sentence into another language.
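This difference shows up directly in the shapes of the model outputs. The following is a minimal sketch (the dimensions and layer choices here are illustrative, not the chapter's final code) contrasting a classifier, which produces one prediction per sentence, with a sequence-to-sequence model, which produces one vocabulary-sized prediction per output position:

```python
import torch
import torch.nn as nn

batch, seq_len, hid_dim, num_classes, vocab_size = 4, 10, 64, 3, 5000
hidden_states = torch.randn(seq_len, batch, hid_dim)   # RNN outputs for a batch

# Classification (previous chapters): one prediction per sentence,
# taken from the final hidden state
class_logits = nn.Linear(hid_dim, num_classes)(hidden_states[-1])
print(class_logits.shape)     # [batch, num_classes] -> a single prediction each

# Sequence-to-sequence: one vocabulary-sized prediction per output position
token_logits = nn.Linear(hid_dim, vocab_size)(hidden_states)
print(token_logits.shape)     # [seq_len, batch, vocab_size] -> a whole sequence
```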

In the following example, we can see that our English-to-Spanish translation maps word to word:

Figure 7.1 – English to Spanish translation

The first word in our input sentence maps nicely to the first word in our output sentence. If this were the case for all languages, we could simply pass each word in our sentence one by one through our trained model to get an output sentence, and there would be no need for any sequence-to-sequence modeling, as shown here:

Figure 7.2 – English-to-Spanish translation of words

...

Building a sequence-to-sequence model for text translation

To build our sequence-to-sequence model for translation, we will implement the encoder/decoder framework we outlined previously. This will show how the two halves of the model work together: the encoder captures a representation of the source sentence, and the decoder translates that representation into the target language. Before we can do this, we need to obtain our data.
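At a high level, the encoder/decoder framework can be sketched as follows. This is a minimal GRU-based sketch under our own assumptions (class names, layer sizes, and the single-layer GRU are illustrative choices, not the chapter's final implementation):

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Embeds source tokens and compresses the sentence into a hidden state."""
    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim)

    def forward(self, src):
        # src: [src_len, batch]
        embedded = self.embedding(src)      # [src_len, batch, emb_dim]
        _, hidden = self.rnn(embedded)      # hidden: [1, batch, hid_dim]
        return hidden

class Decoder(nn.Module):
    """Generates target tokens one step at a time from the encoder's state."""
    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, token, hidden):
        # token: [batch] -> one decoding step at a time
        embedded = self.embedding(token.unsqueeze(0))  # [1, batch, emb_dim]
        output, hidden = self.rnn(embedded, hidden)
        return self.out(output.squeeze(0)), hidden     # logits: [batch, vocab]

# Wire the two halves together for a single decoding step
enc, dec = Encoder(100, 32, 64), Decoder(120, 32, 64)
src = torch.randint(0, 100, (7, 2))                 # 7 source tokens, batch of 2
hidden = enc(src)                                   # encoder's final state
logits, hidden = dec(torch.zeros(2, dtype=torch.long), hidden)
print(logits.shape)                                 # torch.Size([2, 120])
```

In practice, the decoder loop runs until an end-of-sentence token is produced, feeding each predicted token back in as the next input.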

Preparing the data

By now, we know enough about machine learning to know that for a task like this, we will need a set of training data with corresponding labels. In this case, we will need sentences in one language with the corresponding translations in another language. Fortunately, the Torchtext library that we used in the previous chapter contains a dataset that will allow us to get this.

The Multi30k dataset in Torchtext consists of approximately 30,000 sentences with corresponding translations in multiple languages...
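Torchtext handles tokenization and numericalization for us, but the preprocessing it performs can be sketched independently. The following toy example (the sentence pairs and helper names are our own, not the Multi30k data or the Torchtext API) shows the core steps of building a vocabulary with special tokens and converting sentences to index sequences:

```python
from collections import Counter

# A toy parallel corpus standing in for Multi30k's English/German pairs
pairs = [("two dogs play in the snow", "zwei hunde spielen im schnee"),
         ("a man rides a horse", "ein mann reitet ein pferd")]

def build_vocab(sentences, specials=("<pad>", "<sos>", "<eos>", "<unk>")):
    """Map each token to an integer index, reserving ids for special tokens."""
    counts = Counter(tok for s in sentences for tok in s.split())
    vocab = {tok: i for i, tok in enumerate(specials)}
    for tok in sorted(counts):
        vocab[tok] = len(vocab)
    return vocab

en_vocab = build_vocab(src for src, _ in pairs)
de_vocab = build_vocab(trg for _, trg in pairs)

def numericalize(sentence, vocab):
    """Wrap the sentence in <sos>/<eos> markers and convert tokens to ids."""
    unk = vocab["<unk>"]
    return ([vocab["<sos>"]]
            + [vocab.get(t, unk) for t in sentence.split()]
            + [vocab["<eos>"]])

print(numericalize("two dogs play", en_vocab))   # [1, 13, 5, 9, 2]
```

The `<sos>` and `<eos>` markers tell the decoder where a sentence starts and ends, while `<pad>` lets sentences of different lengths share a batch.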

Next steps

While we have shown our sequence-to-sequence model to be effective at performing language translation, the model we trained from scratch is not a perfect translator by any means. This is, in part, due to the relatively small size of our training data. We trained our model on a set of 30,000 English/German sentences. While this might seem very large, in order to train a perfect model, we would require a training set that's several orders of magnitude larger.

In theory, we would require several examples of every word in the English and German languages for our model to truly understand each word's context and meaning. For context, the 30,000 English sentences in our training set consist of just 6,000 unique words. The average vocabulary of an English speaker is said to be between 20,000 and 30,000 words, which gives us an idea of just how many example sentences we would need to train a model that performs perfectly. This is probably why the most accurate translation...
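A figure like the 6,000 unique words above can be measured directly from a tokenized corpus. A minimal sketch with a toy corpus (the sentences here are our own, standing in for the training set):

```python
from collections import Counter

# Toy corpus; running the same count over Multi30k's ~30,000 English
# sentences yields the roughly 6,000 unique tokens mentioned above
sentences = ["a man rides a horse", "a dog runs", "the man walks the dog"]

counts = Counter(tok for s in sentences for tok in s.split())
print(len(counts))             # number of unique tokens in the corpus
print(counts.most_common(2))   # a handful of frequent words dominate
```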

Summary

In this chapter, we covered how to build sequence-to-sequence models from scratch. We learned how to code up our encoder and decoder components individually and how to integrate them into a single model that is able to translate sentences from one language into another.

Although our sequence-to-sequence model, consisting of an encoder and a decoder, is useful for sequence translation, it is no longer state-of-the-art. In recent years, sequence-to-sequence models have been combined with attention mechanisms to achieve state-of-the-art performance.

In the next chapter, we will discuss how attention networks can be used in the context of sequence-to-sequence learning and show how we can use both techniques to build a chatbot.
