Machine Translation with the Transformer

Humans master sequence transduction, transferring a representation to another object. We can easily imagine a mental representation of a sequence. If somebody says, "The flowers in my garden are beautiful," we can easily visualize a garden with flowers in it. We see images of the garden, although we might never have seen that garden. We might even imagine chirping birds and the scent of flowers.

A machine must learn transduction from scratch with numerical representations. Recurrent or convolutional approaches have produced interesting results but have not reached significant BLEU translation evaluation scores. Translating requires the representation of language A transposed into language B.

The transformer model’s self-attention innovation increases the analytic ability of machine intelligence. A sequence in language A is adequately represented before attempting to translate it into language B. Self-attention brings the level of...

Defining machine translation

Vaswani et al. (2017) tackled one of the most difficult NLP problems when designing the Transformer. The human baseline for machine translation seems out of reach for us human-machine intelligence designers. This did not stop Vaswani et al. (2017) from publishing the Transformer’s architecture and achieving state-of-the-art BLEU results.

In this section, we will define machine translation. Machine translation is the process of reproducing human translation by machine transductions and outputs:

Figure 6.1: Machine translation process

The general idea in Figure 6.1 is for the machine to do the following in a few steps:

  • Choose a sentence to translate
  • Learn how words relate to each other with hundreds of millions of parameters
  • Learn the many ways in which words refer to each other
  • Use machine transduction to transfer the learned parameters to new sequences
  • Choose a candidate translation for a word...

Preprocessing a WMT dataset

Vaswani et al. (2017) present the Transformer’s achievements on the WMT 2014 English-to-German translation task and the WMT 2014 English-to-French translation task. The Transformer achieves a state-of-the-art BLEU score. BLEU will be described in the Evaluating machine translation with BLEU section of this chapter.

The WMT 2014 shared task included several European language datasets. One of these datasets was drawn from version 7 of the Europarl corpus. We will be using the French-English dataset from the European Parliament Proceedings Parallel Corpus, 1996-2011 (https://www.statmt.org/europarl/v7/fr-en.tgz).

Once you have downloaded the files and have extracted them, we will preprocess the two parallel files:

  • europarl-v7.fr-en.en
  • europarl-v7.fr-en.fr

We will load, clean, and reduce the size of the corpus.

Let’s start the preprocessing.

Preprocessing the raw data

In this section, we will preprocess...
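As a minimal sketch of the load, clean, and reduce steps described above, the following code reads the two parallel files, applies a simple normalization, and drops sentence pairs containing rare words. It is not the book's listing: the normalization rules, the pair-dropping strategy, and the min_occurrence threshold are illustrative assumptions; only the two file names come from the dataset itself.

import re
from collections import Counter

def load_sentences(path):
    # Read one sentence per line from a Europarl file.
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f]

def clean(sentence):
    # Lowercase and keep only letters, apostrophes, and spaces (a simple normalization choice).
    sentence = sentence.lower()
    sentence = re.sub(r"[^a-zàâäçéèêëîïôöûùüÿœæ' ]+", " ", sentence)
    return re.sub(r"\s+", " ", sentence).strip()

en = [clean(s) for s in load_sentences("europarl-v7.fr-en.en")]
fr = [clean(s) for s in load_sentences("europarl-v7.fr-en.fr")]

# Count word frequencies on the English side of the corpus.
counts = Counter(word for sentence in en for word in sentence.split())

# Keep only the sentence pairs whose English words all occur at least
# min_occurrence times; this reduces the size of the corpus.
min_occurrence = 5  # hypothetical threshold
pairs = [(e, f_) for e, f_ in zip(en, fr)
         if all(counts[w] >= min_occurrence for w in e.split())]

print(f"Kept {len(pairs)} of {len(en)} sentence pairs")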

Evaluating machine translation with BLEU

Papineni et al. (2002) came up with an efficient way to evaluate machine translation. A human baseline was difficult to define. However, they realized that we could obtain effective results if we compared a machine translation, word for word, with one or more human reference translations.

Papineni et al. (2002) named their method the Bilingual Evaluation Understudy (BLEU) score.

In this section, we will use the Natural Language Toolkit (NLTK) to implement BLEU:

http://www.nltk.org/api/nltk.translate.html#nltk.translate.bleu_score.sentence_bleu

We will begin with geometric evaluations.

Geometric evaluations

The BLEU method compares the parts of a candidate sentence to a reference sentence or several reference sentences.

Open BLEU.py, which is in the chapter directory of the GitHub repository of this book.

The program imports the nltk module:

from nltk.translate.bleu_score import sentence_bleu
from nltk.translate.bleu_score import...
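A minimal, self-contained sketch of how sentence_bleu compares a tokenized candidate sentence with one or more tokenized references, with and without smoothing. The example sentences are illustrative, and the calls are standard NLTK usage rather than the book's BLEU.py listing:

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# One or more tokenized reference translations and one tokenized candidate.
reference = [["the", "flowers", "in", "my", "garden", "are", "beautiful"]]
candidate = ["the", "flowers", "in", "the", "garden", "are", "beautiful"]

# Default BLEU: geometric mean of the 1-gram to 4-gram precisions.
print(sentence_bleu(reference, candidate))

# When a candidate has no matching higher-order n-grams, the geometric
# mean collapses toward zero; a smoothing function avoids this.
chencherry = SmoothingFunction()
print(sentence_bleu(reference, candidate,
                    smoothing_function=chencherry.method1))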

Translation with Google Translate

Google Translate, https://translate.google.com/, provides a ready-to-use interface for translations. Google is progressively introducing a transformer encoder into its translation algorithms. In the following section, we will implement a transformer model for a translation task with Google Trax.

However, an AI specialist may not be required at all.

If we enter the sentence analyzed in the previous section in Google Translate, Levez-vous svp pour cette minute de silence, we obtain an English translation in real time:

Figure 6.2: Google Translate

The translation is correct.

Does Industry 4.0 still require AI specialists for translation tasks or simply a web interface developer?

Google provides every service required for translations on their Google Translate platform: https://cloud.google.com/translate:

  • A translation API: A web developer can create an interface for a customer
  • A media translation API that...
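As an illustration of the translation API in the list above, here is a minimal sketch that assumes the google-cloud-translate Python client is installed and that Google Cloud credentials are configured in the environment; the calls use the standard v2 Basic client and are not taken from the book:

from google.cloud import translate_v2 as translate

# Create a client and translate the French sentence used above.
client = translate.Client()
result = client.translate("Levez-vous svp pour cette minute de silence",
                          source_language="fr", target_language="en")
print(result["translatedText"])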

Translations with Trax

Google Brain developed Tensor2Tensor (T2T) to make deep learning development easier. T2T is an extension of TensorFlow that contains a library of deep learning models, including many transformer examples.

Although T2T was a good start, Google Brain then produced Trax, an end-to-end deep learning library. Trax contains a transformer model that can be applied to translations. The Google Brain team presently maintains Trax.

This section will focus on the minimum functions needed to initialize and run the English-German translation task described by Vaswani et al. (2017) to illustrate the Transformer’s performance.

We will be using preprocessed English and German datasets to show that the Transformer architecture is language-agnostic.

Open Trax_Translation.ipynb.

We will begin by installing the modules we need.

Installing Trax

Google Brain has made Trax easy to install and run. We will import the basics along with Trax, which can be installed in one...
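As a hedged sketch of this workflow, the following code is adapted from the English-German translation example in the Trax documentation rather than copied from the notebook: it installs Trax, builds the Transformer with the hyperparameters of Vaswani et al. (2017), loads pre-trained English-German weights, and decodes one sentence. The pip command and the example sentence are illustrative assumptions.

!pip install -q trax

import trax

# Build the original Transformer configuration in prediction mode.
model = trax.models.Transformer(
    input_vocab_size=33300,
    d_model=512, d_ff=2048,
    n_heads=8, n_encoder_layers=6, n_decoder_layers=6,
    max_len=2048, mode='predict')

# Load pre-trained English-German weights published with Trax.
model.init_from_file('gs://trax-ml/models/translation/ende_wmt32k.pkl.gz',
                     weights_only=True)

# Tokenize an English sentence with the subword vocabulary the model was trained on.
sentence = 'Machine translation is a difficult sequence transduction task.'
tokenized = list(trax.data.tokenize(iter([sentence]),
                                    vocab_dir='gs://trax-ml/vocabs/',
                                    vocab_file='ende_32k.subword'))[0]

# Decode autoregressively (temperature=0.0 gives greedy decoding) and de-tokenize.
tokenized = tokenized[None, :]  # Add a batch dimension.
tokenized_translation = trax.supervised.decoding.autoregressive_sample(
    model, tokenized, temperature=0.0)
tokenized_translation = tokenized_translation[0][:-1]  # Drop the batch dimension and the EOS token.
translation = trax.data.detokenize(tokenized_translation,
                                   vocab_dir='gs://trax-ml/vocabs/',
                                   vocab_file='ende_32k.subword')
print(translation)

The pre-trained weights and the subword vocabulary are fetched from public Google Cloud Storage buckets the first time these cells run.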

Summary

In this chapter, we went through three additional essential aspects of the original Transformer.

We started by defining machine translation. Human translation sets an extremely high baseline for machines to reach. We saw that English-French and English-German translation involves numerous problems to solve. The Transformer tackled these problems and set state-of-the-art BLEU records to beat.

We then preprocessed a WMT French-English dataset from the European Parliament that required cleaning. We had to transform the datasets into lines and clean the data up. Once that was done, we reduced the dataset’s size by suppressing words that occurred below a frequency threshold.

Machine translation models must be compared using identical evaluation methods. Training a model on a WMT dataset requires BLEU evaluations. We saw that geometric assessments are a good basis for scoring translations, but even modified BLEU has its limits. We thus added a smoothing technique to enhance...

Questions

  1. Machine translation has now exceeded human baselines. (True/False)
  2. Machine translation requires large datasets. (True/False)
  3. There is no need to compare transformer models using the same datasets. (True/False)
  4. BLEU is the French word for blue and is the acronym of an NLP metric. (True/False)
  5. Smoothing techniques enhance BERT. (True/False)
  6. German-English is the same as English-German for machine translation. (True/False)
  7. The original Transformer multi-head attention sub-layer has 2 heads. (True/False)
  8. The original Transformer encoder has 6 layers. (True/False)
  9. The original Transformer encoder has 6 layers but only 2 decoder layers. (True/False)
  10. You can train transformers without decoders. (True/False)

References
