Chapter 2: A Hands-On Introduction to the Subject

So far, we have taken an overall look at the evolution of Natural Language Processing (NLP) using Deep Learning (DL)-based methods, and we have learned some basic information about Transformers and their architecture. In this chapter, we are going to take a deeper look at how a transformer model can be used. Tokenizers and models, such as Bidirectional Encoder Representations from Transformers (BERT), will be described in more technical detail with hands-on examples, including how to load a tokenizer/model and how to use community-provided pretrained models. But before using any specific model, we will walk through the installation steps required to set up the necessary environment using Anaconda. These steps cover installing the libraries and programs on various operating systems, such as Linux, Windows, and macOS. The installation of PyTorch and TensorFlow, in two versions of a Central Processing...

Technical requirements

You will need to install the libraries and software listed next. Although having the latest versions is a plus, it is mandatory to install versions that are compatible with each other. For more information about installing the latest version of Hugging Face Transformers, take a look at the official web page at https://huggingface.co/transformers/installation.html:

  • Anaconda
  • Transformers 4.0.0
  • PyTorch 1.1.0
  • TensorFlow 2.4.0
  • Datasets 1.4.1

Finally, all the code shown in this chapter is available in this book's GitHub repository at https://github.com/PacktPublishing/Mastering-Transformers/tree/main/CH02.

Check out the following link to see the Code in Action video: https://bit.ly/372ek48

Installing Transformers with Anaconda

Anaconda is a distribution of the Python and R programming languages that makes package management and deployment easy for scientific computing. In this chapter, we will describe the installation of the Transformers library with Anaconda, although it is also possible to install the library without it. The main motivation for using Anaconda is to make the process easier to explain and to manage the packages used.

Before installing the related libraries, installing Anaconda itself is a mandatory step. The official Anaconda documentation provides simple steps for installing it on all common operating systems (macOS, Windows, and Linux).

Installation on Linux

Many Linux distributions are available, but among them, Ubuntu is one of the most widely preferred. In this section, the steps to install Anaconda on Linux are covered. Proceed as follows:

  1. Download the Anaconda installer for Linux from https://www...

Working with language models and tokenizers

In this section, we will look at using the Transformers library with language models and their related tokenizers. In order to use any specified language model, we first need to import it. We will start with the BERT model provided by Google and use its pretrained version, as follows:

>>> from transformers import BertTokenizer
>>> tokenizer = \
BertTokenizer.from_pretrained('bert-base-uncased')

The first line of the preceding code snippet imports the BERT tokenizer, and the second line downloads a pretrained tokenizer for the BERT base version. Note that the uncased version was trained on lowercased text, so it does not matter whether the letters appear in upper- or lowercase. To test and see the output, run the following lines of code:

>>> text = "Using Transformers is easy!"
>>> tokenizer(text)

This will be the output:

{'input_ids': [101, 2478...
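
Beyond the tokenizer, the matching pretrained model can be loaded in the same way. Here is a minimal sketch, reusing the tokenizer and text defined previously (BertModel and the return_tensors option are standard parts of the Transformers API; the exact sequence length in the output depends on how the text is tokenized):

>>> from transformers import BertModel
>>> model = BertModel.from_pretrained('bert-base-uncased')
>>> encoded = tokenizer(text, return_tensors='pt')
>>> outputs = model(**encoded)
>>> # BERT base produces a 768-dimensional hidden state per token
>>> outputs.last_hidden_state.shape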

Working with community-provided models

Hugging Face hosts tons of community models provided by collaborators from large Artificial Intelligence (AI) and Information Technology (IT) companies such as Google and Facebook. There are also many interesting models provided by individuals and universities, and accessing and using them is very easy. To start, visit the models directory available on the Hugging Face website (https://huggingface.co/models), as shown in the following screenshot:

Figure 2.11 – Hugging Face models repository

Apart from these models, there are also many good and useful datasets available for NLP tasks. To start using some of these models, you can explore them by keyword searches, or just specify your major NLP task and pipeline.

For example, suppose we are looking for a table question answering (QA) model. After finding a model that we are interested in, a page such as the following will be available from the Hugging Face website (https://huggingface...
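
As a minimal sketch of how such a community model could be plugged into the pipeline API (the model identifier, google/tapas-base-finetuned-wtq, and the sample table are illustrative choices, assuming a Transformers version that ships the table-question-answering pipeline):

>>> import pandas as pd
>>> from transformers import pipeline
>>> table_qa = pipeline('table-question-answering',
...     model='google/tapas-base-finetuned-wtq')
>>> # TAPAS expects every table cell to be a string
>>> table = pd.DataFrame({'City': ['Istanbul', 'Ankara'],
...     'Population': ['15000000', '5000000']})
>>> table_qa(table=table, query='Which city has the largest population?')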

Working with benchmarks and datasets

Before introducing the datasets library, we'd better talk about important benchmarks such as General Language Understanding Evaluation (GLUE), Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME), and the Stanford Question Answering Dataset (SQuAD). Benchmarking is especially critical for transferring learning within multitask and multilingual environments. In NLP, we mostly focus on a particular metric, that is, a performance score on a certain task or dataset. Thanks to the Transformers library, we are able to transfer what we have learned from a particular task to a related task, which is called Transfer Learning (TL). By transferring representations between related problems, we are able to train general-purpose models that share common linguistic knowledge across tasks, which is also known as Multi-Task Learning (MTL). Another aspect of TL is transferring knowledge across natural languages (multilingual models).
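
As a quick sketch of how one of these benchmarks can be pulled in through the datasets library (the MRPC task is an illustrative pick; any GLUE task name works the same way):

>>> from datasets import load_dataset
>>> mrpc = load_dataset('glue', 'mrpc')
>>> # each split is indexable; this prints the first training example
>>> mrpc['train'][0]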

Important benchmarks

...

Benchmarking for speed and memory

Merely comparing the classification performance of large models on a specific task or benchmark is no longer sufficient. We must now also account for the computational cost of a particular model in a given environment (Random-Access Memory (RAM), CPU, GPU) in terms of memory usage and speed. The computational cost of training and of deploying to production for inference are the two main values to be measured. Two classes of the Transformers library, PyTorchBenchmark and TensorFlowBenchmark, make it possible to benchmark models for both PyTorch and TensorFlow.

Before we start our experiment, we need to check our GPU capabilities by executing the following code:

>>> import torch
>>> print(f"The GPU total memory is {torch.cuda.get_device_properties(0).total_memory /(1024**3)} GB")
The GPU total memory is 2.94921875 GB

The output is obtained from an NVIDIA GeForce GTX 1050 (3 Gigabytes (GB)) card. We need more powerful resources...
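
As a minimal sketch of how the benchmark classes mentioned previously can be used (the model and the batch-size/sequence-length grid here are illustrative; PyTorchBenchmarkArguments is the companion configuration class in the Transformers benchmark API):

>>> from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments
>>> args = PyTorchBenchmarkArguments(
...     models=['bert-base-uncased'],
...     batch_sizes=[1],
...     sequence_lengths=[32, 128])
>>> benchmark = PyTorchBenchmark(args)
>>> # by default, this measures inference speed and memory usage
>>> results = benchmark.run()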

Summary

In this chapter, we covered a variety of introductory topics and got our hands dirty with a hello-world transformer application. This chapter also plays a crucial role in applying what has been learned so far to the upcoming chapters. So, what has been learned so far? We took a first small step by setting up the environment and installing the necessary software. In this context, the Anaconda package manager helped us install the necessary modules for the main operating systems. We also went through language models, community-provided models, and the tokenization process. Additionally, we introduced multitask (GLUE) and cross-lingual (XTREME) benchmarking, which enable language models to become stronger and more accurate. The datasets library was introduced, which facilitates efficient access to NLP datasets provided by the community. Finally, we learned how to evaluate the computational cost of a particular model in terms of memory usage and speed...
