Hands-On Natural Language Processing with PyTorch 1.x

Product type Book
Published in Jul 2020
Publisher Packt
ISBN-13 9781789802740
Pages 276 pages
Edition 1st Edition
Author: Thomas Dop

Table of Contents (14 chapters)

Preface
Section 1: Essentials of PyTorch 1.x for NLP
Chapter 1: Fundamentals of Machine Learning and Deep Learning
Chapter 2: Getting Started with PyTorch 1.x for NLP
Section 2: Fundamentals of Natural Language Processing
Chapter 3: NLP and Text Embeddings
Chapter 4: Text Preprocessing, Stemming, and Lemmatization
Section 3: Real-World NLP Applications Using PyTorch 1.x
Chapter 5: Recurrent Neural Networks and Sentiment Analysis
Chapter 6: Convolutional Neural Networks for Text Classification
Chapter 7: Text Translation Using Sequence-to-Sequence Neural Networks
Chapter 8: Building a Chatbot Using Attention-Based Neural Networks
Chapter 9: The Road Ahead
Other Books You May Enjoy

Building a sequence-to-sequence model for text translation

To build our sequence-to-sequence model for translation, we will implement the encoder/decoder framework we outlined previously. This will show how the two halves of the model work together: the encoder captures a representation of the input sentence, and the decoder translates that representation into another language. Before we can do this, we need to obtain our data.
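As a rough illustration of how the two halves fit together, here is a minimal sketch of an encoder and decoder in PyTorch. It is not the chapter's implementation: the use of single-layer LSTMs, the class names, the layer sizes, and the `greedy_translate` helper are all assumptions made for this example. The encoder compresses the source token IDs into hidden and cell states, and the decoder starts from those states and emits one target token at a time.

```python
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Reads source token ids and returns the final LSTM hidden and cell states."""

    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim)

    def forward(self, src):
        # src: [src_len, batch]
        embedded = self.embedding(src)
        _, (hidden, cell) = self.rnn(embedded)
        return hidden, cell


class Decoder(nn.Module):
    """Predicts scores over the target vocabulary for one time step."""

    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, token, hidden, cell):
        # token: [batch]; add a sequence dimension of length 1
        embedded = self.embedding(token.unsqueeze(0))
        output, (hidden, cell) = self.rnn(embedded, (hidden, cell))
        return self.out(output.squeeze(0)), hidden, cell


def greedy_translate(encoder, decoder, src, sos_idx, max_len):
    """Hypothetical helper: encode once, then decode greedily token by token."""
    hidden, cell = encoder(src)
    token = torch.full((src.shape[1],), sos_idx, dtype=torch.long)
    outputs = []
    for _ in range(max_len):
        logits, hidden, cell = decoder(token, hidden, cell)
        token = logits.argmax(dim=1)  # most likely next token per sentence
        outputs.append(token)
    return torch.stack(outputs)  # [max_len, batch]


# Untrained models with made-up sizes, run on random source token ids.
torch.manual_seed(0)
encoder = Encoder(vocab_size=20, emb_dim=8, hid_dim=16)
decoder = Decoder(vocab_size=25, emb_dim=8, hid_dim=16)
src = torch.randint(0, 20, (5, 3))  # 5 tokens per sentence, batch of 3
with torch.no_grad():
    predicted = greedy_translate(encoder, decoder, src, sos_idx=1, max_len=7)
print(predicted.shape)  # torch.Size([7, 3])
```

Because the models are untrained, the predicted IDs are meaningless; the point is only the flow of data from encoder state to decoder output.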

Preparing the data

A task like this requires a set of training data with corresponding labels. In this case, that means sentences in one language paired with their translations in another. Fortunately, the Torchtext library that we used in the previous chapter contains a dataset that provides exactly this.

The Multi30k dataset in Torchtext consists of approximately 30,000 sentences with corresponding translations in multiple languages...
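However the dataset is loaded, each training example is a source sentence paired with its translation, and both sides must be turned into sequences of token IDs before a model can use them. The sketch below shows that idea in plain Python, without Torchtext. The two sentence pairs are a hypothetical toy corpus (not taken from Multi30k), and the helper names, whitespace tokenizer, and special tokens are assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical toy parallel corpus: (German, English) sentence pairs.
pairs = [
    ("ein hund rennt", "a dog runs"),
    ("ein mann rennt", "a man runs"),
]

# Special tokens: padding, unknown word, start of sentence, end of sentence.
SPECIALS = ["<pad>", "<unk>", "<sos>", "<eos>"]


def tokenize(sentence):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return sentence.lower().split()


def build_vocab(sentences, min_freq=1):
    """Map each sufficiently frequent token to an integer id, specials first."""
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    itos = SPECIALS + sorted(t for t, c in counts.items() if c >= min_freq)
    return {tok: i for i, tok in enumerate(itos)}


def numericalize(sentence, vocab):
    """Convert a sentence to ids, wrapped in <sos> ... <eos>."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in tokenize(sentence)]
    return [vocab["<sos>"]] + ids + [vocab["<eos>"]]


# Each language gets its own vocabulary.
src_vocab = build_vocab(src for src, _ in pairs)
trg_vocab = build_vocab(trg for _, trg in pairs)

examples = [(numericalize(s, src_vocab), numericalize(t, trg_vocab)) for s, t in pairs]
print(examples[0])  # ([2, 4, 5, 7, 3], [2, 4, 5, 7, 3])
```

Words that never appeared in the training sentences map to the `<unk>` ID, which is how a translation model copes with unseen vocabulary at inference time.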
