
You're reading from Transformers for Natural Language Processing and Computer Vision - Third Edition

Product type: Book
Published in: Feb 2024
Publisher: Packt
ISBN-13: 9781805128724
Edition: 3rd Edition
Author (1)

Denis Rothman

Denis Rothman graduated from Sorbonne University and Paris-Diderot University, designing one of the very first word2matrix patented embedding and patented AI conversational agents. He began his career authoring one of the first AI cognitive Natural Language Processing (NLP) chatbots applied as an automated language teacher for Moet et Chandon and other companies. He authored an AI resource optimizer for IBM and apparel producers. He then authored an Advanced Planning and Scheduling (APS) solution used worldwide.

Fine-tuning GPT-3 for completion (generative)

OpenAI (at the time of writing this book) has a service to fine-tune the following original GPT-3 models: davinci, curie, babbage, and ada. They are original models and, as such, have no suffixes. GPT-4 models are not available for fine-tuning at the time of writing. However, if GPT-4 models become available for fine-tuning, the same or a similar process as for GPT-3 will apply.

A fine-tuned model can perform data exploration, classification, question answering, and other NLP tasks like the original models. As such, the fine-tuned model might produce acceptable or inaccurate results. Quality control remains essential. Make sure to go through OpenAI's documentation before beginning a project: https://platform.openai.com/docs/guides/fine-tuning/

This section aims to implement the fine-tuning process of a model in a notebook, cell by cell, so you can apply fine-tuning to your specific domain. Fine-tuning GPT-3 models...
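Before a fine-tuning job can be submitted, the training data must be expressed as JSONL: one JSON object per line with "prompt" and "completion" fields. The following is a minimal sketch of that preparation step; the sentences and the file name are hypothetical stand-ins, not the book's actual Critique of Pure Reason dataset.

```python
import json
import os
import tempfile

# Hypothetical prompt/completion pairs standing in for the pre-processed dataset.
examples = [
    {"prompt": "Pure reason is",
     "completion": " the faculty that supplies the principles of knowledge a priori."},
    {"prompt": "A judgment is analytic when",
     "completion": " its predicate is contained in its subject."},
]

def write_jsonl(records, path):
    """Write records to a JSONL file: one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")

def read_jsonl(path):
    """Read a JSONL file back into a list of dictionaries."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

# Hypothetical training-file path.
path = os.path.join(tempfile.gettempdir(), "kant_prepared.jsonl")
write_jsonl(examples, path)
loaded = read_jsonl(path)
print(len(loaded))  # prints 2
```

In the actual workflow, a file in this format is uploaded to OpenAI and referenced when creating the fine-tuning job; OpenAI's data-preparation tooling can also validate and convert raw data into this layout.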

Fine-tuned for classification (discriminative)

The miracle of generative models, such as GPTs, is that they can perform a classification task with the right prompts! In this section, we will fine-tune babbage-002 to classify baseball and hockey text inputs. You will see that you can fine-tune an original OpenAI model for a wide range of tasks. Your imagination will be the limit!

Open Fine-tuned_classification.ipynb in the chapter directory of the GitHub repository. The structure of the notebook is the same as that of the Fine_tuning_GPT_3.ipynb notebook we just created. The main section titles are identical. Fine_tuning_GPT_3.ipynb was created for completion tasks, with text as a prompt and completion, although you can modify it for any other NLP task. Fine-tuned_classification.ipynb is designed to classify baseball and hockey texts. You can adapt this notebook to other NLP tasks once you have explored it.

The dataset is designed for classification tasks, but the process is the same as the one we went...
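To make a generative model act as a classifier, each training example pairs a text prompt with a short label completion. A minimal sketch of that formatting is below; the separator string, example texts, and label tokens are illustrative conventions, not requirements fixed by the API.

```python
import json

# A separator marks where the prompt ends, so the model learns to emit
# only the label after it. This particular string is a common convention.
SEPARATOR = "\n\n###\n\n"

def to_classification_example(text, label):
    """Format one record: prompt ends with the separator, completion is the label."""
    return {"prompt": text + SEPARATOR, "completion": " " + label}

# Hypothetical stand-ins for the baseball/hockey dataset.
records = [
    to_classification_example("The pitcher threw a no-hitter last night.", "baseball"),
    to_classification_example("He scored on a power play in overtime.", "hockey"),
]

# Serialize to JSONL, ready to be uploaded as a fine-tuning training file.
jsonl = "\n".join(json.dumps(r) for r in records)
print(jsonl.splitlines()[0])
```

At inference time, the same separator is appended to the input text, and the fine-tuned model is expected to complete with one of the trained labels.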

Summary

This chapter led us to the potential of adapting an OpenAI model to our needs through fine-tuning. The process requires careful data analysis and preparation. We must also determine whether fine-tuning on OpenAI's platform violates our privacy, confidentiality, and security requirements.

We first built a fine-tuning process for a completion (generative) task by loading a pre-processed dataset of Immanuel Kant's Critique of Pure Reason. We submitted it to OpenAI's data preparation tool. The tool converted our data into JSONL. An ada model was fine-tuned and stored. We then ran the model.

Then the babbage-002 model was fine-tuned for a classification (discriminative) task. This process brought us back to square one: can a standard OpenAI model achieve the same results as a fine-tuned model? If so, why bother fine-tuning a model?

To satisfy our scientific curiosity, we ran davinci on the same task as the trained ada to classify a text to determine if it was about...

Questions

  1. It is useless to fine-tune an OpenAI model. (True/False)
  2. Any pretrained OpenAI model can do the task we need without fine-tuning. (True/False)
  3. We don't need to prepare a dataset to fine-tune an OpenAI model. (True/False)
  4. If no dataset is available on the web, we don't need one. (follow-up to question 3) (True/False)
  5. We don't need to keep track of the fine-tunes we created. (True/False)
  6. As of July 2023, anybody can access our fine-tunes. (True/False)
  7. Wandb is a state-of-the-art transformer model. (True/False)
  8. Wandb can be synced with OpenAI models. (True/False)
  9. Unfortunately, Wandb cannot display accuracy. (True/False)
  10. The lineage of the fine-tunes is one of Wandb's artifacts. (True/False)

Further Reading

Weights and Biases articles: https://wandb.ai/site/articles
Fine-tuning research: Fine-Tuning Language Models with Just Forward Passes, Malladi et al. (2023), https://arxiv.org/abs/2305.17333

Join our book's Discord space

Join the book's Discord workspace: https://www.packt.link/Transformers



