You're reading from Transformers for Natural Language Processing - Second Edition (2nd Edition, Packt, March 2022, ISBN-13 9781803247335).

Author: Denis Rothman

Denis Rothman graduated from Sorbonne University and Paris-Diderot University, designing one of the very first word2matrix patented embeddings and patented AI conversational agents. He began his career authoring one of the first AI cognitive Natural Language Processing (NLP) chatbots, applied as an automated language teacher for Moet et Chandon and other companies. He authored an AI resource optimizer for IBM and apparel producers. He then authored an Advanced Planning and Scheduling (APS) solution used worldwide.
Chapter 1, What are Transformers?

  1. We are still in the Third Industrial Revolution. (True/False)

    False. Eras in history indeed overlap. However, the Third Industrial Revolution focused on making the world digital. The Fourth Industrial Revolution has begun to connect everything to everything else: systems, machines, bots, robots, algorithms, and more.

  1. The Fourth Industrial Revolution is connecting everything to everything else. (True/False)

    True. This leads to an increasing amount of automated decisions that formerly required human intervention.

  1. Industry 4.0 developers will sometimes have no AI development to do. (True/False)

    True. In some projects, AI will be an online service that requires no development.

  1. Industry 4.0 developers might have to implement transformers from scratch. (True/False)

    True. In some projects, not all, standard online services or APIs might not satisfy the needs of a project. There...

Chapter 2, Getting Started with the Architecture of the Transformer Model

  1. NLP transduction can encode and decode text representations. (True/False)

    True. NLP transduction converts sequences (written or oral) into numerical representations, processes them, and decodes the results back into text.

  1. Natural Language Understanding (NLU) is a subset of Natural Language Processing (NLP). (True/False)

    True.

  1. Language modeling algorithms generate probable sequences of words based on input sequences. (True/False)

    True.

  1. A transformer is a customized LSTM with a CNN layer. (True/False)

    False. A transformer does not contain an LSTM or a CNN at all.

  1. A transformer does not contain LSTM or CNN layers. (True/False)

    True.

  1. Attention examines all the tokens in a sequence, not just the last one. (True/False)

    True.

  1. A transformer does not use positional encoding...
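The answers above note that attention weighs every token in a sequence against every other token, not just the last one. Here is a minimal sketch of scaled dot-product attention in NumPy with toy matrices (not code from the book) to make that concrete:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Toy scaled dot-product attention: every query attends to every key."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # similarity of each token with all tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over the whole sequence
    return weights @ V                             # weighted sum over all tokens

# Three tokens with four-dimensional vectors (random toy values)
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```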

Chapter 3, Fine-Tuning BERT Models

  1. BERT stands for Bidirectional Encoder Representations from Transformers. (True/False)

    True.

  1. BERT is a two-step framework. Step 1 is pretraining. Step 2 is fine-tuning. (True/False)

    True.

  1. Fine-tuning a BERT model implies training parameters from scratch. (True/False)

    False. BERT fine-tuning is initialized with the trained parameters of pretraining.

  1. BERT only pretrains using all downstream tasks. (True/False)

    False.

  1. BERT pretrains on Masked Language Modeling (MLM). (True/False)

    True.

  1. BERT pretrains on Next Sentence Prediction (NSP). (True/False)

    True.

  1. BERT pretrains on mathematical functions. (True/False)

    False.

  1. A question-answer task is a downstream task. (True/False)

    True.

  1. A BERT pretraining model does not require tokenization. (True/False)

    False.

    ...
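To illustrate the Masked Language Modeling objective mentioned in the answers above, a quick sketch with the Hugging Face pipeline API (the bert-base-uncased checkpoint is an assumed example, not necessarily the one used in the chapter):

```python
from transformers import pipeline

# Fill-mask predicts the token hidden behind [MASK], which is how BERT is pretrained (MLM)
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for prediction in fill_mask("Paris is the [MASK] of France."):
    print(prediction["token_str"], round(prediction["score"], 3))
```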

Chapter 4, Pretraining a RoBERTa Model from Scratch

  1. RoBERTa uses a byte-level byte-pair encoding tokenizer. (True/False)

    True.

  1. A trained Hugging Face tokenizer produces merges.txt and vocab.json. (True/False)

    True.

  1. RoBERTa does not use token-type IDs. (True/False)

    True.

  1. DistilBERT has 6 layers and 12 heads. (True/False)

    True.

  1. A transformer model with 80 million parameters is enormous. (True/False)

    False. 80 million parameters is a small model.

  1. We cannot train a tokenizer. (True/False)

    False. A tokenizer can be trained.

  1. A BERT-like model has six decoder layers. (True/False)

    False. The BERT-like model built in this chapter has six encoder layers, not decoder layers.

  1. MLM predicts a word contained in a mask token in a sentence. (True/False)

    True.

  1. A BERT-like model has no self-attention sublayers. (True/False)

    False. BERT has self-attention sublayers in each of its encoder layers.
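As a sketch of how a byte-level BPE tokenizer can be trained to produce merges.txt and vocab.json, assuming the Hugging Face tokenizers library; the corpus path and output directory are placeholders:

```python
import os
from tokenizers import ByteLevelBPETokenizer

# Train a byte-level BPE tokenizer on a plain-text corpus (placeholder path)
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["my_corpus.txt"],
    vocab_size=52_000,
    min_frequency=2,
    special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"],
)

os.makedirs("my_tokenizer", exist_ok=True)
tokenizer.save_model("my_tokenizer")  # writes vocab.json and merges.txt
```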

Chapter 5, Downstream NLP Tasks with Transformers

  1. Machine intelligence uses the same data as humans to make predictions. (True/False)

    True and False.

    True. In some cases, machine intelligence surpasses humans by processing massive amounts of data, extracting meaning, and performing tasks that would take humans centuries.

    False. For NLU, humans have access to more information through their senses. Machine intelligence relies only on the data humans provide, in all types of media.

  1. SuperGLUE is more difficult than GLUE for NLP models. (True/False)

    True.

  1. BoolQ expects a binary answer. (True/False)

    True.

  1. WiC stands for Words in Context. (True/False)

    True.

  1. Recognizing Textual Entailment (RTE) detects whether one sequence entails another sequence. (True/False)

    True.

  1. A Winograd schema predicts whether a verb is spelled correctly. (True/False)

    False. A Winograd schema disambiguates pronouns in a sentence; it does not check spelling.
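To see that BoolQ expects a binary answer, a small sketch using the Hugging Face datasets library (assuming the super_glue dataset configuration is available):

```python
from datasets import load_dataset

# BoolQ pairs a passage with a yes/no question; the label is binary (0 or 1)
boolq = load_dataset("super_glue", "boolq", split="validation")
example = boolq[0]
print(example["question"])
print(example["label"])  # 0 or 1: the answer is binary
```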

Chapter 6, Machine Translation with the Transformer

  1. Machine translation has now exceeded human baselines. (True/False)

    False. Machine translation remains one of the most challenging NLP tasks and has not yet consistently exceeded human baselines.

  1. Machine translation requires large datasets. (True/False)

    True.

  1. There is no need to compare transformer models using the same datasets. (True/False)

    False. The only way to compare different models is to use the same datasets.

  1. BLEU is the French word for blue and is the acronym of an NLP metric. (True/False)

    True. BLEU stands for Bilingual Evaluation Understudy Score, making it easy to remember.

  1. Smoothing techniques enhance BLEU. (True/False)

    True. Smoothing techniques, such as Chen-Cherry smoothing, improve the BLEU evaluation of short sequences (see the sketch after this chapter's answers).

  1. German-English is the same as English-German for machine translation. (True/False)

    False. Representing German and then translating into another language is not the same process as representing English and translating into another...
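The answers above mention BLEU and smoothing. A minimal NLTK sketch (toy sentences, assuming nltk is installed) shows how Chen-Cherry smoothing stabilizes the score on short sequences:

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "sat", "on", "the", "mat"]]
candidate = ["the", "cat", "is", "on", "the", "mat"]

chencherry = SmoothingFunction()
raw = sentence_bleu(reference, candidate)  # can collapse to ~0 on short sentences with no 4-gram match
smoothed = sentence_bleu(reference, candidate, smoothing_function=chencherry.method3)
print(round(raw, 4), round(smoothed, 4))
```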

Chapter 7, The Rise of Suprahuman Transformers with GPT-3 Engines

  1. A zero-shot method trains the parameters once. (True/False)

    False. No parameters are trained.

  1. Gradient updates are performed when running zero-shot models. (True/False)

    False.

  1. GPT models only have a decoder stack. (True/False)

    True.

  1. It is impossible to train a 117M GPT model on a local machine. (True/False)

    False. We trained one in this chapter.

  1. It is impossible to train the GPT-2 model with a specific dataset. (True/False)

    False. We trained one in this chapter.

  1. A GPT-2 model cannot be conditioned to generate text. (True/False)

    False. We implemented this in this chapter.

  1. A GPT-2 model can analyze the context of input and produce completion content. (True/False)

    True.

  1. We cannot interact with a 345M-parameter GPT model on a machine with fewer than eight GPUs. ...
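To illustrate conditioning a GPT-2 model on an input context, as described in the answers above, a quick sketch with the Hugging Face pipeline and the public gpt2 checkpoint (an assumed example, not the exact setup of the chapter):

```python
from transformers import pipeline, set_seed

# Condition GPT-2 on a context and let it generate completion content
generator = pipeline("text-generation", model="gpt2")
set_seed(42)
outputs = generator("The Fourth Industrial Revolution connects", max_length=30, num_return_sequences=2)
for out in outputs:
    print(out["generated_text"])
```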

Chapter 9, Matching Tokenizers and Datasets

  1. A tokenized dictionary contains every word that exists in a language. (True/False)

    False.

  1. Pretrained tokenizers can encode any dataset. (True/False)

    False.

  1. It is good practice to check a database before using it. (True/False)

    True.

  1. It is good practice to eliminate obscene data from datasets. (True/False)

    True.

  1. It is good practice to delete data containing discriminating assertions. (True/False)

    True.

  1. Raw datasets might sometimes produce relationships between noisy content and useful content. (True/False)

    True.

  1. A standard pretrained tokenizer contains the English vocabulary of the past 700 years. (True/False)

    False.

  1. Old English can create problems when encoding data with a tokenizer trained in modern English. (True/False)

    True.

  1. Medical and other types of jargon...
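The answers above point out that a tokenizer trained on modern English can struggle with archaic words and jargon. A short sketch, assuming the bert-base-uncased tokenizer, shows rare words being split into subword pieces:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Common modern words usually stay whole; archaic or jargon words are split into '##' subword pieces
print(tokenizer.tokenize("knight"))
print(tokenizer.tokenize("whilom"))                  # archaic word, likely broken into several subwords
print(tokenizer.tokenize("electroencephalography"))  # medical jargon, also split into pieces
```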

Chapter 10, Semantic Role Labeling with BERT-Based Transformers

  1. Semantic Role Labeling (SRL) is a text generation task. (True/False)

    False.

  1. A predicate is a noun. (True/False)

    False.

  1. A verb is a predicate. (True/False)

    True.

  1. Arguments can describe who and what is doing something. (True/False)

    True.

  1. A modifier can be an adverb. (True/False)

    True.

  1. A modifier can be a location. (True/False)

    True.

  1. A BERT-based model contains encoder and decoder stacks. (True/False)

    False.

  1. A BERT-based SRL model has standard input formats. (True/False)

    True.

  1. Transformers can solve any SRL task. (True/False)

    False.
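The answers above describe predicates, arguments, and modifiers. The following is a purely hypothetical SRL frame for one sentence, shown as a plain data structure to illustrate the labels (it is not the chapter's AllenNLP output format):

```python
# Hypothetical SRL frame for "Bob bought a car in Paris yesterday."
srl_frame = {
    "predicate": "bought",        # the verb is the predicate
    "ARG0": "Bob",                # who is doing the action
    "ARG1": "a car",              # what the action is done to
    "ARGM-LOC": "in Paris",       # a modifier can be a location
    "ARGM-TMP": "yesterday",      # a modifier can be an adverbial expression of time
}
print(srl_frame["predicate"], "->", srl_frame["ARG0"], "/", srl_frame["ARG1"])
```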

Chapter 11, Let Your Data Do the Talking: Story, Questions, and Answers

  1. A trained transformer model can answer any question. (True/False)

    False.

  1. Question-answering requires no further research. It is perfect as it is. (True/False)

    False.

  1. Named Entity Recognition (NER) can provide useful information when looking for meaningful questions. (True/False)

    True.

  1. Semantic Role Labeling (SRL) is useless when preparing questions. (True/False)

    False.

  1. A question generator is an excellent way to produce questions. (True/False)

    True.

  1. Implementing question-answering requires careful project management. (True/False)

    True.

  1. ELECTRA models have the same architecture as GPT-2. (True/False)

    False.

  1. ELECTRA models have the same architecture as BERT but are trained as discriminators. (True/False)

    True.

  1. NER can recognize a location...
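As a brief illustration of extractive question answering over a short story, a sketch with the Hugging Face pipeline (the default QA checkpoint is assumed):

```python
from transformers import pipeline

# Extractive question answering: the answer is a span of the story
qa = pipeline("question-answering")
story = ("Jo and Maria took a taxi to the airport on Monday. "
         "Their flight to Rome left two hours late.")
result = qa(question="Where did Jo and Maria fly to?", context=story)
print(result["answer"], round(result["score"], 3))
```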

Chapter 12, Detecting Customer Emotions to Make Predictions

  1. It is not necessary to pretrain transformers for sentiment analysis. (True/False)

    False.

  1. A sentence is always positive or negative. It cannot be neutral. (True/False)

    False.

  1. The principle of compositionality signifies that a transformer must grasp every part of a sentence to understand it. (True/False)

    True.

  1. RoBERTa-large was designed to improve the pretraining process of transformer models. (True/False)

    True.

  1. A transformer can provide feedback that informs us of whether a customer is satisfied or not. (True/False)

    True.

  1. If the sentiment analysis of a product or service is consistently negative, it helps us make appropriate decisions to improve our offer. (True/False)

    True.

  1. If a model fails to provide a good result on a task, it requires more training or fine-tuning before changing models...
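To show how a transformer can report whether a customer is satisfied, a minimal sentiment-analysis sketch with the Hugging Face pipeline (default checkpoint assumed, not necessarily the RoBERTa-large setup of the chapter):

```python
from transformers import pipeline

# Classify customer feedback as positive or negative
classifier = pipeline("sentiment-analysis")
reviews = [
    "The delivery was fast and the product works perfectly.",
    "The support team never answered my emails.",
]
for review, result in zip(reviews, classifier(reviews)):
    print(result["label"], round(result["score"], 3), "-", review)
```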

Chapter 13, Analyzing Fake News with Transformers

  1. News labeled as fake news is always fake. (True/False)

    False.

  1. News that everybody agrees with is always accurate. (True/False)

    False.

  1. Transformers can be used to run sentiment analysis on Tweets. (True/False)

    True.

  1. Key entities can be extracted from Facebook messages with a DistilBERT model running NER. (True/False)

    True.

  1. Key verbs can be identified from YouTube chats with BERT-based models running SRL. (True/False)

    True.

  1. Emotional reactions are a natural first response to fake news. (True/False)

    True.

  1. A rational approach to fake news can help clarify one’s position. (True/False)

    True.

  1. Connecting transformers to reliable websites can help somebody understand why some news is fake. (True/False)

    True.

  1. Transformers can make summaries of reliable websites...
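To illustrate extracting key entities from short messages, a sketch with the Hugging Face token-classification pipeline (its default checkpoint is assumed; the chapter's DistilBERT setup may differ):

```python
from transformers import pipeline

# Named Entity Recognition on a short message, with entities grouped into spans
ner = pipeline("ner", aggregation_strategy="simple")
text = "The report was shared on Twitter by a journalist based in Berlin."
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
```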

Chapter 14, Interpreting Black Box Transformer Models

  1. BERTViz only shows the output of the last layer of the BERT model. (True/False)

    False. BERTViz displays the outputs of all the layers.

  1. BERTViz shows the attention heads of each layer of a BERT model. (True/False)

    True.

  1. BERTViz shows how the tokens relate to each other. (True/False)

    True.

  1. LIT shows the inner workings of the attention heads like BERTViz. (True/False)

    False. LIT does not display the inner workings of the attention heads the way BERTViz does. However, LIT provides non-probing analyses such as PCA and UMAP projections.

  1. Probing is a way for an algorithm to predict language representations. (True/False)

    True.

  1. NER is a probing task. (True/False)

    True.

  1. PCA and UMAP are non-probing tasks. (True/False)

    True.

  1. LIME is model-agnostic. (True/False)

    True.

  1. Transformers deepen the relationships of the tokens layer by layer. (True/False)

    True.

    ...
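A notebook sketch of BERTViz's head_view, which displays the attention heads of every layer as described in the answers above (assumes the bertviz package and a Jupyter or Colab environment):

```python
from transformers import BertTokenizer, BertModel
from bertviz import head_view

# Load BERT with attention outputs enabled so every layer's heads can be visualized
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

inputs = tokenizer.encode("The cat sat on the mat", return_tensors="pt")
outputs = model(inputs)
attention = outputs.attentions                        # one attention tensor per layer, all heads
tokens = tokenizer.convert_ids_to_tokens(inputs[0])
head_view(attention, tokens)                          # interactive view of all layers and heads
```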

Chapter 15, From NLP to Task-Agnostic Transformer Models

  1. Reformer transformer models don’t contain encoders. (True/False)

    False. Reformer transformer models contain encoders.

  1. Reformer transformer models don’t contain decoders. (True/False)

    False. Reformer transformer models contain encoders and decoders.

  1. The inputs are stored layer by layer in Reformer models. (True/False)

    False. The inputs are recomputed at each layer during backpropagation instead of being stored, thus saving memory.

  1. DeBERTa transformer models disentangle content and positions. (True/False)

    True.

  1. It is necessary to test the hundreds of pretrained transformer models before choosing one for a project. (True/False)

    True and False. You can try all of the models, or you can choose a very reliable model and implement it to fit your needs.

  1. The latest transformer model is always the best. (True/False)

    True and false. A lot of research...
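As a sketch of running a pretrained Reformer model, assuming the transformers library and the public google/reformer-crime-and-punishment checkpoint (an assumed example, not necessarily the chapter's):

```python
from transformers import ReformerModelWithLMHead, ReformerTokenizer

# Reformer uses reversible layers: layer inputs are recomputed during backpropagation
# rather than stored, which saves memory on long sequences.
tokenizer = ReformerTokenizer.from_pretrained("google/reformer-crime-and-punishment")
model = ReformerModelWithLMHead.from_pretrained("google/reformer-crime-and-punishment")

input_ids = tokenizer("A few months later", return_tensors="pt").input_ids
output_ids = model.generate(input_ids, max_length=60, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0]))
```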

Chapter 16, The Emergence of Transformer-Driven Copilots

  1. AI copilots that can generate code automatically do not exist. (True/False)

    False. GitHub Copilot, for example, is now in production.

  1. AI copilots will never replace humans. (True/False)

    True and false. AI will take over many tasks in sales, support, maintenance, and other domains. However, many complex tasks will still require human intervention.

  1. GPT-3 engines can only do one task. (True/False)

    False. GPT-3 engines can do a wide variety of tasks.

  1. Transformers can be trained to be recommenders. (True/False)

    True. Transformers have gone from language sequences to sequences in many domains.

  1. Transformers can only process language. (True/False)

    False. Once transformers are trained for language sequences, they can analyze many other types of sequences.

  1. A transformer sequence can only contain words. (True/False) ...

Chapter 17, The Consolidation of Suprahuman Transformers with OpenAI’s ChatGPT and GPT-4

  1. GPT-4 is sentient. (True/False)

    False. GPT-4 is a mathematical algorithm. It does not need to be sentient to learn statistical patterns to do a wide variety of tasks.

  2. ChatGPT can replace a human expert. (True/False)

    False. ChatGPT can produce results based on its datasets. However, it cannot make subject matter expert (SME) decisions.

  3. GPT-4 can generate source code for any task. (True/False)

    False. GPT-4 can generate source code for many tasks. However, for complex problems, human intervention is required.

  4. Advanced prompt engineering is intuitive. (True/False)

    False. Advanced prompt engineering has become a skill that is based on in-depth knowledge of transformers. Advanced prompt engineering involves building knowledge bases, multiple types of objects for the completion of APIs, and understanding the many models available.

  5. The most...
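Advanced prompt engineering, as described in the answer above, typically combines a system role, task instructions, and context. A minimal sketch assuming the openai Python package's ChatCompletion interface (pre-1.0 versions); the model name and key handling are placeholders:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; keep real keys out of source code

# A system message frames the assistant's behavior; the user message carries the task and context
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Explain in two sentences why transformers replaced RNNs for translation."},
    ],
    temperature=0.2,
)
print(response["choices"][0]["message"]["content"])
```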