Let Your Data Do the Talking: Story, Questions, and Answers

Reading comprehension requires many skills. When we read a text, we notice the keywords and the main events and create mental representations of the content. We can then answer questions using our knowledge of the content and our representations. We also examine each question carefully to avoid traps and mistakes.

No matter how powerful they have become, transformers cannot easily answer open questions. An open environment means that somebody can ask any question on any topic and expect a transformer to answer correctly. That is difficult but possible to some extent with GPT-3, as we will see in this chapter. However, transformers often use general-domain training datasets in a closed question-and-answer environment. For example, critical answers in medical care and law interpretation will often require additional NLP functionality.

However, transformers cannot answer any question correctly regardless of whether the training...

Methodology

Question-answering is mainly presented as an NLP exercise involving a transformer and a dataset containing ready-to-ask questions and the answers to those questions. The transformer is trained to answer the questions asked in this closed environment.

However, in more complex situations, reliable transformer model implementations require customized methods.

Transformers and methods

A perfect and efficient universal transformer model for question-answering or any other NLP task does not exist. The best model for a project is the one that produces the best outputs for a specific dataset and task.

In many cases, the method matters more than the model. For example, a suitable method applied to an average model will often produce better results than a flawed method applied to an excellent model.

In this chapter, we will run DistilBERT, ELECTRA, and RoBERTa models. Some perform better than others.

However, performance does not guarantee a result in a critical...

Method 0: Trial and error

Question-answering seems very easy. Is that true? Let’s find out.

Open QA.ipynb, the Google Colab notebook we will be using in this chapter. We will run the notebook cell by cell.

Run the first cell to install Hugging Face’s transformers, the framework we will be using in this chapter:

!pip install -q transformers

Note: Hugging Face transformers continually evolve, updating libraries and modules to adapt to the market. If the default version doesn’t work, you might have to pin one with !pip install transformers==[version that runs with the other functions in the notebook].

We will now import Hugging Face’s pipeline, which gives access to many ready-to-use transformer resources. It provides high-level abstractions over the Hugging Face library so that we can perform a wide range of NLP tasks through a simple API. The program was created on Google Colab. It recommended...
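
As a minimal sketch of the kind of cell that follows, assuming the default question-answering pipeline and an illustrative context (the notebook's exact text is truncated above):

from transformers import pipeline

# Default question-answering pipeline; DistilBERT is the default model
nlp_qa = pipeline("question-answering")

# Illustrative context and question, not the notebook's exact wording
context = ("The traffic began to slow down on Pioneer Boulevard in Los Angeles, "
           "making it difficult to get to Las Vegas.")
print(nlp_qa(question="Where did the traffic slow down?", context=context))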

Method 1: NER first

This section will use NER to help us find ideas for good questions. NER can detect people, locations, organizations, and other entities in a sequence. We will first run a NER task that will give us some of the main parts of the paragraph we can focus on to ask questions. Bear in mind that transformer models are continuously trained and updated, the datasets used for training might change, and these are not rule-based algorithms that produce the same result each time, so the outputs might change from one run to another.

Using NER to find questions

We will continue to run QA.ipynb cell by cell. The program now initializes the pipeline for the NER task with the default model and tokenizer:

nlp_ner = pipeline("ner")

We will continue to use the deceptively simple sequence we ran in the Method 0: Trial and error section of this chapter:

sequence = "The traffic began to slow down on Pioneer Boulevard in Los Angeles, making it difficult...

Method 2: SRL first

The transformer could not find who was driving to Las Vegas and thought it was Nat King Cole instead of Jo and Maria.

What went wrong? Can we see what the transformers think and obtain an explanation? To find out, let’s go back to semantic role labeling (SRL). If necessary, take a few minutes to review Chapter 10, Semantic Role Labeling with BERT-Based Transformers.

Let’s run the same sequence on AllenNLP in the Semantic Role Labeling section, https://demo.allennlp.org/semantic-role-labeling, to obtain a visual representation of the verb drove in our sequence by running the SRL BERT model we used in the previous chapter:

Figure 11.2: SRL run on the text

SRL BERT found 19 frames. In this section, we focus on drove.

Note: The results may vary from one run to another or when AllenNLP updates the model versions.

We can see the problem. The argument of the verb drove is Jo and Maria. It seems that the inference...
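
As a hedged sketch, the same SRL BERT analysis can also be run programmatically rather than through the web demo; the model archive URL and the shortened sentence below are assumptions that may need updating:

# Assumes allennlp and allennlp-models are installed (pip install allennlp allennlp-models)
from allennlp.predictors.predictor import Predictor

# Published SRL BERT archive used by the AllenNLP demo; check for the current version
predictor = Predictor.from_path(
    "https://storage.googleapis.com/allennlp-public-models/"
    "structured-prediction-srl-bert.2020.12.15.tar.gz"
)

result = predictor.predict(sentence="Jo and Maria drove to Las Vegas.")
for frame in result["verbs"]:
    # Each frame describes one verb and its labeled arguments (ARG0, ARG1, ...)
    print(frame["verb"], "->", frame["description"])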

Next steps

There are no shortcuts or easy ways to implement question-answering. We began to implement methods that could generate questions automatically. Automatic question generation is a critical aspect of NLP.

More transformer models need to be pretrained on multi-task datasets containing NER, SRL, and question-answering problems. Project managers also need to learn how to combine several NLP tasks to help solve a specific task, such as question-answering.

Coreference resolution, https://demo.allennlp.org/coreference-resolution, could have helped our model identify the main subjects in the sequence we worked on. This result produced with AllenNLP shows an interesting analysis:

Figure 11.8: Coreference resolution of a sequence

We could continue to develop our program by adding the output of coreference resolution:

Set0 = {'Los Angeles', 'the city', 'LA'}
Set1 = {'Jo and Maria', 'their', 'they'}

We could add coreference...
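
For instance, here is a minimal sketch of one way to feed the coreference sets back into the question-answering pipeline; the shortened sequence and the question are illustrative assumptions:

from transformers import pipeline

nlp_qa = pipeline("question-answering")

# Illustrative, shortened version of the sequence (the full text is not shown above)
sequence = ("The traffic began to slow down on Pioneer Boulevard in Los Angeles. "
            "Jo and Maria drove through the city on their way to Las Vegas.")

# State the coreference clusters explicitly so the model can link
# pronouns and aliases to the entities they refer to
coref_hint = ("Los Angeles, the city, and LA refer to the same place. "
              "They and their refer to Jo and Maria.")

result = nlp_qa(question="Who drove to Las Vegas?",
                context=sequence + " " + coref_hint)
print(result["answer"])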

Summary

In this chapter, we found that question-answering isn’t as easy as it seems. Implementing a transformer model only takes a few minutes. However, getting it to work can take a few hours or several months!

We first asked the default transformer in the Hugging Face pipeline to answer some simple questions. DistilBERT, the default transformer, answered the simple questions quite well. However, we chose easy questions. In real life, users ask all kinds of questions. The transformer can get confused and produce erroneous output.

We then faced a choice: continue to ask random questions and get random answers, or begin to design the blueprint of a question generator, which is a more productive solution.

We started by using NER to find useful content. We designed a function that could automatically create questions based on NER output. The quality was promising but required more work.

We tried an ELECTRA model that did not produce the results we expected...

Questions

  1. A trained transformer model can answer any question. (True/False)
  2. Question-answering requires no further research. It is perfect as it is. (True/False)
  3. Named Entity Recognition (NER) can provide useful information when looking for meaningful questions. (True/False)
  4. Semantic Role Labeling (SRL) is useless when preparing questions. (True/False)
  5. A question generator is an excellent way to produce questions. (True/False)
  6. Implementing question answering requires careful project management. (True/False)
  7. ELECTRA models have the same architecture as GPT-2. (True/False)
  8. ELECTRA models have the same architecture as BERT but are trained as discriminators. (True/False)
  9. NER can recognize a location and label it as I-LOC. (True/False)
  10. NER can recognize a person and label that person as I-PER. (True/False)

References

Join our book’s Discord space

Join the book’s Discord workspace for a monthly Ask me Anything session with the authors:

https://www.packt.link/Transformers
