
Further Reading

You can examine many other large language model benchmarking approaches and tools.

Ultimately, the decision to use a benchmarking framework depends on each project.

Evaluating machine translations

Vaswani et al. (2017) presented the Original Transformer’s achievements on the Workshop on Statistical Machine Translation (WMT) 2014 English-to-German translation task and the WMT 2014 English-to-French translation task. The Original Transformer achieved a state-of-the-art BLEU score. BLEU will be described in the Evaluating machine translation with BLEU section of this chapter.

However, we must begin by preprocessing the WMT dataset we will examine.

Preprocessing a WMT dataset

The 2014 WMT tasks included several European language datasets, one of which came from version 7 of the Europarl corpus. We will use the French-English dataset from the European Parliament Proceedings Parallel Corpus, 1996–2011 (https://www.statmt.org/europarl/v7/fr-en.tgz).

Open WMT-translations.ipynb, which is in the chapter directory of the GitHub repository.

The first step is to download the files we need:

import urllib.request
# Define the...
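
As a minimal sketch of this download step (the local filename and the extraction call are assumptions; the URL is the Europarl address quoted above, and this is not necessarily the notebook’s exact cell):

import os
import tarfile
import urllib.request

# Europarl v7 French-English parallel corpus (URL given above)
url = "https://www.statmt.org/europarl/v7/fr-en.tgz"
archive = "fr-en.tgz"  # assumed local filename

# Download the archive once, then unpack the aligned text files
if not os.path.exists(archive):
    urllib.request.urlretrieve(url, archive)
with tarfile.open(archive, "r:gz") as tar:
    tar.extractall()  # yields europarl-v7.fr-en.en and europarl-v7.fr-en.fr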

Translations with Google Trax

Google Brain developed Tensor2Tensor (T2T) to make deep learning development easier. T2T is an extension of TensorFlow and contains a library of deep learning models, including many transformer examples.

Although T2T was a good start, Google Brain superseded it with Trax, an end-to-end deep learning library. Trax contains a Transformer model that can be applied to translations, and the Google Brain team presently maintains it.

This section will focus on the minimum functions to initialize the English-German problem described by Vaswani et al. (2017), illustrating the Original Transformer’s performance.

We will use preprocessed English and German datasets to show that the transformer architecture is language-agnostic.

Open Trax_Google_Translate.ipynb. We will begin by installing the modules we need.

Installing Trax

Google Brain has made Trax easy to install and run. We will import the basics along with Trax, which can be installed...
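
As a minimal sketch based on the public Trax quickstart (the gs://trax-ml weight and vocabulary paths come from that quickstart, not necessarily the book’s notebook), installing Trax and running the pretrained English-German Transformer can look like this:

# !pip install -q trax
import trax

# Build the Original Transformer architecture in inference mode
model = trax.models.Transformer(
    input_vocab_size=33300,
    d_model=512, d_ff=2048,
    n_heads=8, n_encoder_layers=6, n_decoder_layers=6,
    max_len=2048, mode='predict')

# Load pretrained English-German weights hosted by Google
model.init_from_file('gs://trax-ml/models/translation/ende_wmt32k.pkl.gz',
                     weights_only=True)

# Tokenize an English sentence with the matching subword vocabulary
sentence = 'I am only a machine but I have machine intelligence.'
tokenized = next(trax.data.tokenize(iter([sentence]),
                                    vocab_dir='gs://trax-ml/vocabs/',
                                    vocab_file='ende_32k.subword'))

tokenized = tokenized[None, :]  # add a batch dimension
# Decode greedily, drop the end-of-sentence token, and detokenize
tokenized_translation = trax.supervised.decoding.autoregressive_sample(
    model, tokenized, temperature=0.0)
translation = trax.data.detokenize(tokenized_translation[0][:-1],
                                   vocab_dir='gs://trax-ml/vocabs/',
                                   vocab_file='ende_32k.subword')
print(translation)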

Translation with Google Translate

Google Translate (https://translate.google.com/) provides a ready-to-use official interface for translations, and Google has incorporated transformer technology into its translation algorithms.

However, an AI specialist may not be required at all.

If we enter the sentence analyzed in the previous section, Levez-vous svp pour cette minute de silence (“Please rise for this minute of silence”), into Google Translate, we obtain an English translation in real time:

Figure 4.2: Google Translate

The translation is correct.

Does the AI industry still require AI specialists for translation tasks, or simply a web interface developer?

Google provides every service required for translations on their Google Translate platform: https://cloud.google.com/translate:

  • A translation API: A web developer can create an interface for a customer (a minimal usage sketch follows this list).
  • A media translation API that can translate your streaming content.
  • An AutoML translation service that will train a custom model for...
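
As an illustration of the first item, here is a minimal sketch using the google-cloud-translate Python client (assuming the library is installed and Google Cloud credentials are configured; this is not the book’s code):

# pip install google-cloud-translate
from google.cloud import translate_v2 as translate

# The client reads credentials from GOOGLE_APPLICATION_CREDENTIALS
client = translate.Client()

result = client.translate(
    "Levez-vous svp pour cette minute de silence",
    source_language="fr",
    target_language="en",
)
print(result["translatedText"])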

Translation with Gemini

How far can we go with Gemini for translations? We reviewed some of Gemini’s NLP abilities in Chapter 3, Emergent vs Downstream Tasks: The Unseen Depths of Transformers. Now, we will create a dialog with Gemini to explore its potential and limitations.

Go to https://gemini.google.com/ to start a dialog.

Needless to say, Gemini has been trained with large datasets in scores of languages.

We will begin with Gemini’s potential and then search for its limitations.

My prompts begin with Denis: and end with a question mark; Gemini’s comprehensive response follows each prompt.

Gemini’s potential

Denis: Can you translate a sentence from English to French?
Yes, I can translate a sentence from English to French. For example, if you say "I am a large language model," the translation would be "Je suis un grand modèle linguistique."
Here are some other examples of English-to-French translations...

Summary

In this chapter, we explored some of the essential aspects of translations with transformers.

We started by defining machine translation. Human translation sets a high baseline for machines to reach. We saw that English-French and English-German translations involve numerous problems to solve. The Original Transformer tackled these problems and set state-of-the-art BLEU records to beat.

We then preprocessed a WMT French-English dataset from the European Parliament proceedings, transforming the raw files into aligned lines and cleaning the data. Once that was done, we reduced the dataset’s size by removing words that occurred below a frequency threshold.
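
A minimal sketch of that thresholding step (the threshold value and the unk marker are illustrative assumptions, not the chapter’s exact code):

from collections import Counter

def trim_low_frequency(lines, min_count=5):
    # Tokenize each cleaned line into lowercase alphabetic words
    tokenized = [[w for w in line.lower().split() if w.isalpha()]
                 for line in lines]
    # Count word occurrences over the whole corpus
    counts = Counter(w for line in tokenized for w in line)
    # Replace rare words with an out-of-vocabulary marker
    return [[w if counts[w] >= min_count else 'unk' for w in line]
            for line in tokenized]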

Comparing machine translation models requires identical evaluation methods. For example, training a model on a WMT dataset requires BLEU evaluations. We saw that the geometric mean of n-gram precisions is a good basis for scoring translations, but even modified BLEU has its limits. We thus added a smoothing technique to...
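
That smoothing step can be illustrated with NLTK’s BLEU implementation, which exposes the Chen-Cherry smoothing methods (a minimal sketch with toy sentences, not the chapter’s exact cell):

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [['the', 'cat', 'sat', 'on', 'the', 'mat']]
candidate = ['the', 'cat', 'is', 'on', 'the', 'mat']

# Unsmoothed BLEU collapses toward 0 when a higher-order n-gram is missing
print(sentence_bleu(reference, candidate))

# Chen-Cherry smoothing keeps the score informative for short sentences
chencherry = SmoothingFunction()
print(sentence_bleu(reference, candidate,
                    smoothing_function=chencherry.method1))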

Questions

  1. Machine translation has now exceeded human baselines. (True/False)
  2. Machine translation requires large datasets. (True/False)
  3. There is no need to compare transformer models using the same datasets. (True/False)
  4. BLEU is the French word for blue and is the acronym of an NLP metric. (True/False)
  5. Smoothing techniques enhance BERT. (True/False)
  6. German-English is the same as English-German for machine translation. (True/False)
  7. The Original Transformer multi-head attention sub-layer has two heads. (True/False)
  8. The Original Transformer encoder has six layers. (True/False)
  9. The Original Transformer encoder has six layers but only two decoder layers. (True/False)
  10. You can train transformers without decoders. (True/False)

References

  • Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu, 2002, BLEU: a Method for Automatic Evaluation of Machine Translation: https://aclanthology.org/P02-1040.pdf
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin, 2017, Attention Is All You Need: https://arxiv.org/abs/1706.03762

