Transformers for Natural Language Processing and Computer Vision - Third Edition

Transformers for Natural Language Processing and Computer Vision: Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3, Third Edition

By Denis Rothman
€32.99 €22.99
Book Feb 2024 728 pages 3rd Edition
eBook
€32.99 €22.99
Print
€41.99
Subscription
€14.99 Monthly

What do you get with eBook?

• Instant access to your Digital eBook purchase
• Download this book in EPUB and PDF formats
• Access this title in our online reader with advanced features
• DRM FREE - Read whenever, wherever and however you want

Product Details


Publication date : Feb 29, 2024
Length : 728 pages
Edition : 3rd Edition
Language : English
ISBN-13 : 9781805128724
Vendor : OpenAI


Transformers for Natural Language Processing and Computer Vision, Third Edition: Take Generative AI and LLMs to the next level with Hugging Face, Google Vertex AI, ChatGPT, GPT-4V, and DALL-E 3

Welcome to Packt Early Access. We’re giving you an exclusive preview of this book before it goes on sale. It can take many months to write a book, but our authors have cutting-edge information to share with you today. Early Access gives you an insight into the latest developments by making chapter drafts available. The chapters may be a little rough around the edges right now, but our authors will update them over time. You can dip in and out of this book or follow along from start to finish; Early Access is designed to be flexible. We hope you enjoy getting to know more about the process of writing a Packt book.

  1. Chapter 1: What are Transformers?
  2. Chapter 2: Getting Started with the Architecture of the Transformer Model
  3. Chapter 3: Emergent vs Downstream Tasks:...

How constant time complexity O(1) changed our lives forever

How could this deceptively simple O(1) time complexity class forever change AI and our everyday lives? How could O(1) explain the profound architectural changes that made ChatGPT so powerful and stunned the world? How can something as simple as O(1) allow systems like ChatGPT to spread to every domain and hundreds of tasks?

The answer to these questions rests on a single principle: the only way to find your way through the growing maze of transformer datasets, models, and applications is to focus on the concepts underlying those thousands of assets. These concepts will take you to the core of the functionality you need for your projects.

This section will provide a significant answer to those questions before we move on to see how one token (a minimal piece of a word) started an AI revolution that is raging around the world, triggering automation never seen before.

We need to get to the bottom of the chaos and disruption generated by transformers...
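Before going further, a toy sketch (not the book's code) can make the O(1) claim concrete: in a recurrent model, information must pass through every intermediate position to relate two tokens, while an attention layer scores any pair of positions directly. The two helper functions below are hypothetical illustrations of that path-length difference, assuming we measure cost in sequential steps:

```python
# Illustrative sketch: path length between two token positions.
# An RNN carries information step by step, so relating two tokens costs
# O(n) sequential operations; attention compares any two positions in one
# direct step, so the path length is O(1) regardless of distance.

def rnn_steps_to_relate(pos_a, pos_b):
    """Sequential recurrence steps needed to carry information
    from position pos_a to position pos_b in a recurrent model."""
    return abs(pos_b - pos_a)

def attention_steps_to_relate(pos_a, pos_b):
    """An attention layer scores every pair of positions directly,
    so the path between any two tokens has constant length."""
    return 1

# Relating token 0 to token 500 in a 501-token sequence:
print(rnn_steps_to_relate(0, 500))        # 500 sequential steps
print(attention_steps_to_relate(0, 500))  # 1 step, whatever the distance
```

The constant path length is what lets every token attend to every other token in a single layer, at the price of computing all pairwise scores.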

From one token to an AI revolution

Yes, the title is correct, as you will see in this section. One token produced an AI revolution and has opened the door to AI in every domain and application.

ChatGPT with GPT-4, PaLM 2, and other LLMs have a unique way of producing text.

In LLMs, a token is a minimal word part. The token is where a Large Language Model starts and ends.
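To make sub-word tokenization tangible, here is a toy greedy splitter over a tiny invented vocabulary, loosely in the spirit of BPE/WordPiece. It is purely illustrative: real LLMs use vocabularies learned from data (for example, via the Hugging Face tokenizers library), and both the vocabulary and the greedy strategy below are assumptions for demonstration:

```python
# Illustrative toy tokenizer: greedily match the longest known subword
# from the left; fall back to single characters when nothing matches.
# TOY_VOCAB is a hypothetical vocabulary invented for this sketch.

TOY_VOCAB = {"includ", "ing", "transform", "er", "token"}

def toy_tokenize(word):
    """Split a word into subword tokens by longest-match-first scanning."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest span first
            if word[i:j] in TOY_VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:                               # no subword matched: emit one char
            tokens.append(word[i])
            i += 1
    return tokens

print(toy_tokenize("including"))    # ['includ', 'ing']
print(toy_tokenize("transformer"))  # ['transform', 'er']
```

Trained tokenizers choose their merges statistically rather than greedily over a hand-picked set, but the output shape, a word becoming a short sequence of sub-word tokens, is the same.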

For example, the word including could become includ + ing, representing two tokens. GPT models predict tokens based on the hundreds of billions of tokens in their training datasets. Examine the graph in Figure 1.9 of an OpenAI GPT model making an inference to produce a token:


Figure 1.9: GPT inference graph built in Python with NetworkX

It may come as a surprise, but the only parts of this figure controlled by the model itself are Model and Output Generation, which produce the raw logits. All the rest belongs to the pipeline.
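What "raw logits" hand off to the pipeline can be sketched in a few lines. The vocabulary and logit values below are invented for illustration; the softmax-then-select step is the standard greedy-decoding convention, not code from the book:

```python
import math

# Minimal sketch of the decoding step surrounding the model: the model
# emits one raw score (logit) per vocabulary entry; the pipeline converts
# the scores into probabilities and picks the next token.

vocab = ["includ", "ing", "the", "token"]   # hypothetical tiny vocabulary
logits = [1.0, 3.5, 0.2, -1.0]              # invented raw model outputs

def softmax(scores):
    """Turn raw logits into a probability distribution."""
    m = max(scores)                          # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
next_token = vocab[probs.index(max(probs))]  # greedy decoding step
print(next_token)   # 'ing' -- the highest-logit entry wins under greedy decoding
```

Sampling strategies (temperature, top-k, top-p) replace the greedy pick in practice, but they are equally part of the pipeline, not the model.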

To understand the pipeline, we will first go through the description...

Foundation Models

Advanced large multipurpose transformer models represent such a paradigm change that they require a new name to describe them: Foundation Models. Accordingly, Stanford University created the Center for Research on Foundation Models (CRFM). In August 2021, the CRFM published a two-hundred-page paper (see the References section) written by over one hundred scientists and professionals: On the Opportunities and Risks of Foundation Models.

Foundation Models were not created by academia but by the big tech industry. Google invented the transformer model, leading to Google BERT, LaMDA, PaLM 2, and more. Microsoft partnered with OpenAI to produce ChatGPT with GPT-4, and soon more.

Big tech had to find a better model to face the exponential increase of petabytes of data flowing into their data centers. Transformers were thus born out of necessity.

Let’s consider the evolution of LLMs to understand the need for industrialized AI models.

Transformers...

The role of AI professionals

Transformer-driven AI is connecting everything to everything, everywhere. Machines communicate directly with other machines. AI-driven IoT signals trigger automated decisions without human intervention. NLP algorithms send automated reports, summaries, emails, advertisements, and more.

AI specialists must adapt to this new era of increasingly automated tasks, including transformer model implementations. AI professionals will have new functions. If we list the transformer NLP tasks that an AI specialist will have to perform, from top to bottom, it appears that some high-level tasks require little to no development. An AI specialist can instead act as an AI guru, providing design ideas, explanations, and implementations.

The pragmatic definition of what a transformer represents for an AI specialist will vary with the ecosystem.

Let’s go through a few examples:

  • API: The OpenAI API does not require an AI developer. A web designer...

The rise of transformer seamless APIs and assistants

We are now well into the industrialization era of AI. Microsoft Azure, Google Cloud, Amazon Web Services (AWS), and IBM, among others, provide AI services that no developer or team of developers could hope to outperform. Tech giants have million-dollar supercomputers with massive datasets to train transformer models and AI models in general.

Big tech giants have many corporate customers that already use their cloud services. As a result, adding a transformer API to an existing cloud architecture requires less effort than any other solution.

A small company or even an individual can access the most powerful transformer models through an API with practically no investment in development. An intern can implement the API in a few days. There is no need to be an engineer or have a Ph.D. for such a simple implementation.
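To show how small such an implementation really is, here is a hedged sketch of the request a hosted LLM API expects. The endpoint, model name, and payload shape follow OpenAI's chat-completions convention, but treat them as assumptions to be checked against the provider's current API reference; nothing is actually sent here:

```python
import json

# Sketch only: build and inspect the JSON body a chat-completion request
# would carry. No network call is made, and the field values are
# illustrative, not an endorsement of any specific model or endpoint.

API_URL = "https://api.openai.com/v1/chat/completions"  # illustrative endpoint

payload = {
    "model": "gpt-4",                      # hosted model: no training required
    "messages": [
        {"role": "user", "content": "Summarize transformers in one sentence."}
    ],
    "max_tokens": 60,
}

body = json.dumps(payload)
print(len(json.loads(body)["messages"]))   # 1 -- a single user message
```

The entire "implementation" is a payload and an HTTP POST, which is exactly why it needs days rather than months.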

For example, the OpenAI platform now has a Software as a Service (SaaS) API for some of the most effective...

Summary

Transformers forced AI to make profound evolutions. Foundation Models, including their Generative AI abilities, are built on top of the digital revolution connecting everything to everything with underlying processes everywhere. Automated processes are replacing human decisions in critical areas, including NLP.

RNNs slowed the progression of automated NLP tasks required in a fast-moving world. Transformers filled the gap. A corporation needs summarization, translation, and a wide range of NLP tools to meet the challenges of the growing volume of incoming information.

Transformers have thus spurred an age of AI industrialization. We first saw how the O(1) path length of the attention layers, together with their O(n²·d) computational complexity, shook the world of AI.
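The quadratic part of that complexity can be sketched with simple arithmetic, assuming we count the multiply-accumulates in the attention score matrix (an illustration, not the book's code):

```python
# Illustrative sketch: why self-attention's score computation scales as
# O(n^2 * d). Each of the n query positions is compared with each of the
# n key positions, and every comparison is a d-dimensional dot product,
# giving n * n * d multiply-accumulate operations per attention head.

def attention_score_ops(n, d):
    """Multiply-accumulates for the full n x n attention score matrix."""
    return n * n * d

print(attention_score_ops(1024, 64))   # 67108864
print(attention_score_ops(2048, 64))   # 268435456 -- 4x the work for 2x length
```

Doubling the sequence length quadruples the score-matrix work, which is why context length, not depth, dominates the cost of scaling transformers.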

We saw how the one-token flexibility of transformer models pervaded every domain of our everyday lives!

Platforms such as Hugging Face, Google Cloud, OpenAI, and Microsoft Azure provide NLP tasks...

Questions

  1. ChatGPT is a game-changer. (True/False)
  2. ChatGPT can replace all AI algorithms. (True/False)
  3. AI developers will sometimes have no AI development to do. (True/False)
  4. AI developers might have to implement transformers from scratch. (True/False)
  5. It’s not necessary to learn more than one transformer ecosystem, such as Hugging Face. (True/False)
  6. A ready-to-use transformer API can satisfy all needs. (True/False)
  7. A company will accept the transformer ecosystem a developer knows best. (True/False)
  8. Cloud transformers have become mainstream. (True/False)
  9. A transformer project can be run on a laptop. (True/False)
  10. AI specialists will have to be more flexible. (True/False)

References

Further reading

Join our community on Discord

Join our community’s Discord space for discussions with the authors and other readers:

https://www.packt.link/Transformers


Key benefits

  • Compare and contrast 20+ models (including GPT-4, BERT, and Llama 2) and multiple platforms and libraries to find the right solution for your project
  • Apply RAG with LLMs using customized texts and embeddings
  • Mitigate LLM risks, such as hallucinations, using moderation models and knowledge bases
  • Purchase of the print or Kindle book includes a free eBook in PDF format

Description

Transformers for Natural Language Processing and Computer Vision, Third Edition, explores Large Language Model (LLM) architectures, applications, and various platforms (Hugging Face, OpenAI, and Google Vertex AI) used for Natural Language Processing (NLP) and Computer Vision (CV). The book guides you through different transformer architectures to the latest Foundation Models and Generative AI. You’ll pretrain and fine-tune LLMs and work through different use cases, from summarization to implementing question-answering systems with embedding-based search techniques. You will also learn the risks of LLMs, from hallucinations and memorization to privacy, and how to mitigate such risks using moderation models with rule and knowledge bases. You’ll implement Retrieval Augmented Generation (RAG) with LLMs to improve the accuracy of your models and gain greater control over LLM outputs. Dive into generative vision transformers and multimodal model architectures and build applications, such as image and video-to-text classifiers. Go further by combining different models and platforms and learning about AI agent replication. This book provides you with an understanding of transformer architectures, pretraining, fine-tuning, LLM use cases, and best practices.

What you will learn

  • Break down and understand the architectures of the Original Transformer, BERT, GPT models, T5, PaLM, ViT, CLIP, and DALL-E
  • Fine-tune BERT, GPT, and PaLM 2 models
  • Learn about different tokenizers and the best practices for preprocessing language data
  • Pretrain a RoBERTa model from scratch
  • Implement retrieval augmented generation and rule bases to mitigate hallucinations
  • Visualize transformer model activity for deeper insights using BertViz, LIME, and SHAP
  • Go in-depth into vision transformers with CLIP, DALL-E 2, DALL-E 3, and GPT-4V


Table of Contents

24 Chapters
Preface
What Are Transformers?
Getting Started with the Architecture of the Transformer Model
Emergent vs Downstream Tasks: The Unseen Depths of Transformers
Advancements in Translations with Google Trax, Google Translate, and Gemini
Diving into Fine-Tuning through BERT
Pretraining a Transformer from Scratch through RoBERTa
The Generative AI Revolution with ChatGPT
Fine-Tuning OpenAI GPT Models
Shattering the Black Box with Interpretable Tools
Investigating the Role of Tokenizers in Shaping Transformer Models
Leveraging LLM Embeddings as an Alternative to Fine-Tuning
Toward Syntax-Free Semantic Role Labeling with ChatGPT and GPT-4
Summarization with T5 and ChatGPT
Exploring Cutting-Edge LLMs with Vertex AI and PaLM 2
Guarding the Giants: Mitigating Risks in Large Language Models
Beyond Text: Vision Transformers in the Dawn of Revolutionary AI
Transcending the Image-Text Boundary with Stable Diffusion
Hugging Face AutoTrain: Training Vision Models without Coding
On the Road to Functional AGI with HuggingGPT and its Peers
Beyond Human-Designed Prompts with Generative Ideation
Other Books You May Enjoy
Index
Appendix: Answers to the Questions

