
Further Reading

You can examine many other large language model benchmarking approaches and tools.

Ultimately, the decision to use a benchmarking framework depends on each project.

Evaluating machine translations

Vaswani et al. (2017) presented the Original Transformer’s achievements on the Workshop on Statistical Machine Translation (WMT) 2014 English-to-German translation task and the WMT 2014 English-to-French translation task. The Original Transformer achieved a state-of-the-art BLEU score. BLEU will be described in the Evaluating machine translation with BLEU section of this chapter.

However, we must begin by preprocessing the WMT dataset we will examine.

Preprocessing a WMT dataset

The 2014 WMT tasks included several European language datasets, one of which came from version 7 of the Europarl corpus. We will use the French-English dataset from the European Parliament Proceedings Parallel Corpus, 1996–2011 (https://www.statmt.org/europarl/v7/fr-en.tgz).

Open WMT-translations.ipynb, which is in the chapter directory of the GitHub repository.

The first step is to download the files we need:

import urllib.request
# Define the...
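
As a minimal sketch of this download step (the local filename and the extraction call are assumptions; the URL is the Europarl address quoted above, and this is not necessarily the notebook’s exact cell):

import os
import tarfile
import urllib.request

# Europarl v7 French-English parallel corpus (URL given above)
url = "https://www.statmt.org/europarl/v7/fr-en.tgz"
archive = "fr-en.tgz"  # assumed local filename

# Download the archive once, then unpack the aligned text files
if not os.path.exists(archive):
    urllib.request.urlretrieve(url, archive)
with tarfile.open(archive, "r:gz") as tar:
    tar.extractall()  # yields europarl-v7.fr-en.en and europarl-v7.fr-en.fr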

Translations with Google Trax

Google Brain developed Tensor2Tensor (T2T) to make deep learning development easier. T2T is an extension of TensorFlow and contains a library of deep learning models, including many transformer examples.

Although T2T was a good start, Google Brain superseded it with Trax, an end-to-end deep learning library. Trax contains a Transformer model that can be applied to translations, and the Google Brain team presently maintains it.

This section will focus on the minimum functions to initialize the English-German problem described by Vaswani et al. (2017), illustrating the Original Transformer’s performance.

We will use preprocessed English and German datasets to show that the transformer architecture is language-agnostic.

Open Trax_Google_Translate.ipynb. We will begin by installing the modules we need.

Installing Trax

Google Brain has made Trax easy to install and run. We will import the basics along with Trax, which can be installed...
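
As a minimal sketch based on the public Trax quickstart (the gs://trax-ml weight and vocabulary paths come from that quickstart, not necessarily the book’s notebook), installing Trax and running the pretrained English-German Transformer can look like this:

# !pip install -q trax
import trax

# Build the Original Transformer architecture in inference mode
model = trax.models.Transformer(
    input_vocab_size=33300,
    d_model=512, d_ff=2048,
    n_heads=8, n_encoder_layers=6, n_decoder_layers=6,
    max_len=2048, mode='predict')

# Load pretrained English-German weights hosted by Google
model.init_from_file('gs://trax-ml/models/translation/ende_wmt32k.pkl.gz',
                     weights_only=True)

# Tokenize an English sentence with the matching subword vocabulary
sentence = 'I am only a machine but I have machine intelligence.'
tokenized = next(trax.data.tokenize(iter([sentence]),
                                    vocab_dir='gs://trax-ml/vocabs/',
                                    vocab_file='ende_32k.subword'))

tokenized = tokenized[None, :]  # add a batch dimension
# Decode greedily, drop the end-of-sentence token, and detokenize
tokenized_translation = trax.supervised.decoding.autoregressive_sample(
    model, tokenized, temperature=0.0)
translation = trax.data.detokenize(tokenized_translation[0][:-1],
                                   vocab_dir='gs://trax-ml/vocabs/',
                                   vocab_file='ende_32k.subword')
print(translation)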

Translation with Google Translate

Google Translate (https://translate.google.com/) provides a ready-to-use official interface for translations, and Google has incorporated transformer technology into its translation algorithms.

However, an AI specialist may not be required at all.

If we enter the sentence analyzed in the previous section, Levez-vous svp pour cette minute de silence (“Please rise for this minute of silence”), into Google Translate, we obtain an English translation in real time:

Figure 4.2: Google Translate

The translation is correct.

Does the AI industry still require AI specialists for translation tasks, or simply a web interface developer?

Google provides every service required for translations on their Google Translate platform: https://cloud.google.com/translate:

  • A translation API: A web developer can create an interface for a customer (a minimal usage sketch follows this list).
  • A media translation API that can translate your streaming content.
  • An AutoML translation service that will train a custom model for...
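
As an illustration of the first item, here is a minimal sketch using the google-cloud-translate Python client (assuming the library is installed and Google Cloud credentials are configured; this is not the book’s code):

# pip install google-cloud-translate
from google.cloud import translate_v2 as translate

# The client reads credentials from GOOGLE_APPLICATION_CREDENTIALS
client = translate.Client()

result = client.translate(
    "Levez-vous svp pour cette minute de silence",
    source_language="fr",
    target_language="en",
)
print(result["translatedText"])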

Translation with Gemini

How far can we go with Gemini for translations? We reviewed some of Gemini’s NLP abilities in Chapter 3, Emergent vs Downstream Tasks: The Unseen Depths of Transformers. Now, we will create a dialog with Gemini to explore its potential and limitations.

Go to https://gemini.google.com/ to start a dialog.

Needless to say, Gemini has been trained with large datasets in scores of languages.

We will begin with Gemini’s potential and then search for its limitations.

My prompts begin with Denis: and end with a question mark; Gemini’s comprehensive response follows each prompt.

Gemini’s potential

Denis: Can you translate a sentence from English to French?
Yes, I can translate a sentence from English to French. For example, if you say "I am a large language model," the translation would be "Je suis un grand modèle linguistique."
Here are some other examples of English-to-French translations...

Summary

In this chapter, we explored some of the essential aspects of translations with transformers.

We started by defining machine translation. Human translation sets a high baseline for machines to reach. We saw that English-French and English-German translations involve numerous problems to solve. The Original Transformer tackled these problems and set state-of-the-art BLEU records to beat.

We then preprocessed a WMT French-English dataset from the European Parliament proceedings, transforming the raw files into aligned lines and cleaning the data. Once that was done, we reduced the dataset’s size by removing words that occurred below a frequency threshold.
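
A minimal sketch of that thresholding step (the threshold value and the unk marker are illustrative assumptions, not the chapter’s exact code):

from collections import Counter

def trim_low_frequency(lines, min_count=5):
    # Tokenize each cleaned line into lowercase alphabetic words
    tokenized = [[w for w in line.lower().split() if w.isalpha()]
                 for line in lines]
    # Count word occurrences over the whole corpus
    counts = Counter(w for line in tokenized for w in line)
    # Replace rare words with an out-of-vocabulary marker
    return [[w if counts[w] >= min_count else 'unk' for w in line]
            for line in tokenized]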

Comparing machine translation models requires identical evaluation methods. For example, training a model on a WMT dataset requires BLEU evaluations. We saw that the geometric mean of n-gram precisions is a good basis for scoring translations, but even modified BLEU has its limits. We thus added a smoothing technique to...
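
That smoothing step can be illustrated with NLTK’s BLEU implementation, which exposes the Chen-Cherry smoothing methods (a minimal sketch with toy sentences, not the chapter’s exact cell):

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [['the', 'cat', 'sat', 'on', 'the', 'mat']]
candidate = ['the', 'cat', 'is', 'on', 'the', 'mat']

# Unsmoothed BLEU collapses toward 0 when a higher-order n-gram is missing
print(sentence_bleu(reference, candidate))

# Chen-Cherry smoothing keeps the score informative for short sentences
chencherry = SmoothingFunction()
print(sentence_bleu(reference, candidate,
                    smoothing_function=chencherry.method1))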

Questions

  1. Machine translation has now exceeded human baselines. (True/False)
  2. Machine translation requires large datasets. (True/False)
  3. There is no need to compare transformer models using the same datasets. (True/False)
  4. BLEU is the French word for blue and is the acronym of an NLP metric. (True/False)
  5. Smoothing techniques enhance BERT. (True/False)
  6. German-English is the same as English-German for machine translation. (True/False)
  7. The Original Transformer multi-head attention sub-layer has two heads. (True/False)
  8. The Original Transformer encoder has six layers. (True/False)
  9. The Original Transformer encoder has six layers but only two decoder layers. (True/False)
  10. You can train transformers without decoders. (True/False)

References

  • Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu, 2002, BLEU: a Method for Automatic Evaluation of Machine Translation: https://aclanthology.org/P02-1040.pdf
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin, 2017, Attention Is All You Need: https://arxiv.org/abs/1706.03762

