Chapter 9: Cross-Lingual and Multilingual Language Modeling

Up to this point, you have learned a lot about transformer-based architectures, from encoder-only models to decoder-only models, and from efficient transformers to long-context transformers. You have also learned about semantic text representation based on a Siamese network. However, we discussed all these models in terms of monolingual problems; we assumed that they understand only a single language and cannot achieve a general understanding of text regardless of the language itself. In fact, some of these models have multilingual variants; Multilingual Bidirectional Encoder Representations from Transformers (mBERT), Multilingual Text-to-Text Transfer Transformer (mT5), and Multilingual Bidirectional and Auto-Regressive Transformer (mBART), to name but a few. On the other hand, some models are specifically designed for multilingual purposes and are trained with cross-lingual objectives. For example, Cross-lingual Language...

Technical requirements

The code for this chapter can be found at https://github.com/PacktPublishing/Mastering-Transformers/tree/main/CH09, which is part of the GitHub repository for this book. We will be using Jupyter Notebook to run our coding exercises, which require Python 3.6.0+, and the following packages need to be installed (see the install command after this list):

  • tensorflow
  • pytorch
  • transformers >=4.00
  • datasets
  • sentence-transformers
  • umap-learn
  • openpyxl
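
A minimal way to install these packages from within a Jupyter notebook is the one-liner below; note that the pytorch item above corresponds to the torch package on PyPI, and any version pinning is left to you, so treat this as an assumed typical setup rather than the book's exact command:

    !pip install tensorflow torch transformers datasets sentence-transformers umap-learn openpyxl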

Check out the following link to see the Code in Action video:

https://bit.ly/3zASz7M

Translation language modeling and cross-lingual knowledge sharing

So far, you have learned about Masked Language Modeling (MLM) as a cloze task. However, language modeling with neural networks falls into three categories, based on the approach itself and its practical usage, as follows:

  • MLM
  • Causal Language Modeling (CLM)
  • Translation Language Modeling (TLM)

It is also important to note that there are other pre-training approaches, such as Next Sentence Prediction (NSP) and Sentence Order Prediction (SOP), but here we consider only token-based language modeling. The three approaches listed above are the main ones used in the literature. MLM, described in detail in previous chapters, is a concept very close to the cloze task in language learning.

CLM is defined as predicting the next token given the tokens that precede it. For example, if you see the following context, you can easily predict the next token:

<s>...
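
To make the distinction between these objectives concrete, the following minimal sketch (not taken from the chapter's own code) contrasts MLM and CLM using the Hugging Face pipeline API; the checkpoints bert-base-uncased and gpt2 are just illustrative choices:

    from transformers import pipeline

    # MLM: predict the masked token using both left and right context (cloze-style)
    fill_mask = pipeline("fill-mask", model="bert-base-uncased")
    print(fill_mask("Paris is the [MASK] of France.")[0]["token_str"])

    # CLM: predict the next tokens given only the preceding context
    generator = pipeline("text-generation", model="gpt2")
    print(generator("Paris is the capital of", max_length=12)[0]["generated_text"])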

XLM and mBERT

We have picked two models to explain in this section: mBERT and XLM. We selected them because they represent the two main types of multilingual model at the time of writing. mBERT is a multilingual model trained with the MLM objective on corpora from many different languages; it can operate on many languages, but only separately for each one. XLM, on the other hand, is trained on different corpora using the MLM, CLM, and TLM objectives and can solve cross-lingual tasks. For instance, it can measure the similarity of sentences in two different languages by mapping them into a common vector space, which is not possible with mBERT.

mBERT

You are familiar with the BERT autoencoding model from Chapter 3, Autoencoding Language Models, and with how to train it using MLM on a specified corpus. Now imagine a case where a very large corpus is provided, drawn not from a single language but from 104 languages. Training on such a corpus results in a multilingual version of BERT. However...
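
As a minimal illustration (not the chapter's own code), the publicly released 104-language mBERT checkpoint, bert-base-multilingual-cased, loads like any other BERT model, and its single WordPiece tokenizer covers all of the languages it was trained on; the Turkish sentence below is just an illustrative example:

    from transformers import AutoTokenizer, AutoModel

    # mBERT: one BERT model whose vocabulary and weights cover 104 languages
    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    model = AutoModel.from_pretrained("bert-base-multilingual-cased")

    # The same tokenizer handles text from any covered language
    print(tokenizer.tokenize("Transformers are powerful."))
    print(tokenizer.tokenize("Transformatörler güçlüdür."))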

Cross-lingual similarity tasks

Cross-lingual models are capable of representing text in a unified form, where sentences from different languages that are close in meaning are mapped to similar vectors in the vector space. XLM-R, as detailed in the previous section, is one of the successful models in this scope. Now, let's look at some applications of this capability.

Cross-lingual text similarity

In the following example, you will see how a cross-lingual language model pre-trained on the XNLI dataset can be used to find similar texts across different languages. One use-case scenario is a plagiarism detection system that requires exactly this capability. We will use sentences from the Azerbaijani language and see whether XLM-R can find the matching English sentences, if there are any; the sentences in the two languages are translations of each other. Here are the steps to take:

  1. First, you need to load a model for this task, as follows:
    from sentence_transformers import SentenceTransformer...
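
The chapter's own code is truncated above; as a rough sketch of the overall flow (the checkpoint stsb-xlm-r-multilingual and the example sentence pair are assumptions rather than the chapter's exact choices), the similarity computation could look like this:

    from sentence_transformers import SentenceTransformer
    from sklearn.metrics.pairwise import cosine_similarity

    # A multilingual sentence encoder; the chapter's exact checkpoint is not shown above
    model = SentenceTransformer("stsb-xlm-r-multilingual")

    en_sentences = ["The weather is nice today."]   # English
    az_sentences = ["Bu gün hava gözəldir."]        # Azerbaijani (rough translation of the same sentence)

    en_emb = model.encode(en_sentences)
    az_emb = model.encode(az_sentences)

    # Parallel sentences should land close together in the shared vector space
    print(cosine_similarity(en_emb, az_emb))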

Cross-lingual classification

So far, you have learned that cross-lingual models can understand different languages in a semantic vector space, where similar sentences, regardless of their language, are close in terms of vector distance. But how is it possible to use this capability in use cases where only a few samples are available?

For example, suppose you are trying to develop an intent classification model for a chatbot and you have few or no samples for the second language, but for the first language, let's say English, you do have enough samples. In such cases, it is possible to freeze the cross-lingual model itself and just train a classifier on top of it for the task. The trained classifier can then be tested on a second language instead of the one it was trained on.

In this section, you will learn how to train a cross-lingual model on English for text classification and test it on other languages. We have selected a very low-resource language known...
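
As a rough sketch of this freeze-and-classify idea (the encoder checkpoint, the toy intent labels, and the test sentences below are all assumptions for illustration, not the chapter's actual dataset):

    from sentence_transformers import SentenceTransformer
    from sklearn.linear_model import LogisticRegression

    # Frozen multilingual encoder: only the lightweight classifier on top is trained
    encoder = SentenceTransformer("stsb-xlm-r-multilingual")

    # Tiny illustrative English training set for a two-class intent task
    train_texts = ["book me a flight", "reserve a plane ticket", "play some music", "put on a song"]
    train_labels = [0, 0, 1, 1]   # 0 = travel intent, 1 = music intent

    clf = LogisticRegression(max_iter=1000)
    clf.fit(encoder.encode(train_texts), train_labels)

    # Test on a different language without any training samples in that language
    test_texts = ["bana bir uçuş ayarla", "bir şarkı çal"]   # Turkish: "book me a flight", "play a song"
    print(clf.predict(encoder.encode(test_texts)))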

Cross-lingual zero-shot learning

In previous sections, you learned how to perform zero-shot text classification using monolingual models. Using XLM-R for multilingual and cross-lingual zero-shot classification is identical to the approach and code used previously, so we will use mT5 here.

mT5, a massively multilingual pre-trained language model, is based on the encoder-decoder Transformer architecture and is nearly identical to T5. Whereas T5 is pre-trained only on English, mT5 is trained on 101 languages from the multilingual Common Crawl corpus (mC4).

The fine-tuned version of mT5 on the XNLI dataset is available from the HuggingFace repository (https://huggingface.co/alan-turing-institute/mt5-large-finetuned-mnli-xtreme-xnli).

The T5 model and its variant, mT5, are completely text-to-text models, which means they produce text for any task they are given, even if the task is classification or NLI. So, when running inference with this model, extra steps are required. We'll take the...
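
To illustrate this text-to-text behavior, here is a minimal sketch of running the XNLI-fine-tuned mT5 checkpoint with the standard generate API; the prompt template below is an assumption for illustration, and the generated tokens still have to be mapped back to the entailment/neutral/contradiction labels as described in the checkpoint's documentation:

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    model_name = "alan-turing-institute/mt5-large-finetuned-mnli-xtreme-xnli"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    premise = "I love going to the cinema."       # the inputs could be in any of the covered languages
    hypothesis = "I enjoy watching movies."

    # Hypothetical prompt format; check the checkpoint's model card for the exact template
    text = f"xnli: premise: {premise} hypothesis: {hypothesis}"
    inputs = tokenizer(text, return_tensors="pt")

    # The model answers by generating tokens rather than emitting class logits directly
    output_ids = model.generate(**inputs, max_length=5)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))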

Fundamental limitations of multilingual models

Although multilingual and cross-lingual models are promising and will affect the direction of NLP work, they still have some limitations, and many recent works have addressed them. Currently, the mBERT model slightly underperforms its monolingual counterparts on many tasks and may not be a viable substitute for a well-trained monolingual model, which is why monolingual models are still widely used.

Studies in the field indicate that multilingual models suffer from the so-called curse of multilingualism as they seek to represent all languages appropriately. Adding new languages to a multilingual model improves its performance up to a certain point, but adding more languages beyond that point degrades performance, which may be due to the shared vocabulary. Compared to monolingual models, multilingual models are also significantly more limited in terms of parameter budget. They need to allocate their vocabulary...

Summary

In this chapter, you learned about multilingual and cross-lingual language model pre-training and the difference between monolingual and multilingual pre-training. CLM and TLM were also covered, and you gained knowledge of them. You learned how to use cross-lingual models in various use cases, such as semantic search, plagiarism detection, and zero-shot text classification. You also learned how to train on a dataset in one language and test on a completely different language using cross-lingual models. The fine-tuning performance of multilingual models was also evaluated, and we concluded that some multilingual models can substitute for monolingual models while remarkably keeping the performance loss to a minimum.

In the next chapter, you will learn how to deploy transformer models for real problems and train them for production at an industrial scale.

References

  • Conneau, A., Lample, G., Rinott, R., Williams, A., Bowman, S. R., Schwenk, H. and Stoyanov, V. (2018). XNLI: Evaluating cross-lingual sentence representations. arXiv preprint arXiv:1809.05053.
  • Xue, L., Constant, N., Roberts, A., Kale, M., Al-Rfou, R., Siddhant, A. and Raffel, C. (2020). mT5: A massively multilingual pre-trained text-to-text transformer. arXiv preprint arXiv:2010.11934.
  • Lample, G. and Conneau, A. (2019). Cross-lingual language model pretraining. arXiv preprint arXiv:1901.07291.
  • Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzmán, F. and Stoyanov, V. (2019). Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116.
  • Feng, F., Yang, Y., Cer, D., Arivazhagan, N. and Wang, W. (2020). Language-agnostic BERT sentence embedding. arXiv preprint arXiv:2007.01852.
  • Rust, P., Pfeiffer, J., Vulić, I., Ruder, S. and Gurevych, I. (2020). How Good is Your Tokenizer? On the Monolingual...