
You're reading from  The Definitive Guide to Google Vertex AI

Product type: Book
Published in: Dec 2023
Publisher: Packt
ISBN-13: 9781801815260
Edition: 1st Edition
Authors (2):
Jasmeet Bhatia

Jasmeet is a Machine Learning Architect with over 8 years of experience in Data Science and Machine Learning Engineering at Google and Microsoft, and 17 years of overall experience in Product Engineering and Technology consulting at Deloitte, Disney, and Motorola. He has been involved in building technology solutions that solve complex business problems by utilizing information and data assets. He has built high-performing engineering teams and designed and built global-scale AI/Machine Learning, Data Science, and Advanced Analytics solutions for image recognition, natural language processing, sentiment analysis, and personalization.

Kartik Chaudhary

Kartik is an Artificial Intelligence and Machine Learning professional with 6+ years of industry experience in developing and architecting large-scale AI/ML solutions using advances in Machine Learning, Deep Learning, Computer Vision, and Natural Language Processing. Kartik has filed 9 patents at the intersection of Machine Learning, Healthcare, and Operations. Kartik loves sharing knowledge, blogging, travel, and photography.


Document AI – An End-to-End Solution for Processing Documents

Almost every business relies on documents of some kind to convey information daily, whether as emails, contracts, forms, PDFs, or other formats. Because this data is unstructured, many businesses fail to take advantage of the value it holds. Converting this huge volume of document data into a machine-readable format would enable many useful tasks, such as automating business processes, running analytics, and applying AI and ML. Given the volume of data, it is often not feasible to parse these documents manually. Tools such as optical character recognition (OCR) can partially automate the work by converting a document into text, but that text is still unstructured, and more effort is required to make it useful.

Document AI is Google Cloud’s managed service that converts unstructured content...

Technical requirements

The code examples shown in this chapter can be found in the following GitHub repository: https://github.com/PacktPublishing/The-Definitive-Guide-to-Google-Vertex-AI/tree/main/Chapter13.

What is Document AI?

Document AI is an end-to-end AI-based solution for extracting and classifying useful information from any kind of unstructured document, including scanned images, PDFs, forms, emails, and contracts. It includes pre-trained ML models for extraction and other document-related tasks, and it provides the flexibility to uptrain existing models and train custom models without writing much code. Document AI is a unified solution that can help businesses manage the entire unstructured document life cycle, with the aim of achieving high accuracy at low cost and accelerating deployment to meet customer expectations.

Some key features of Google Cloud’s Document AI platform are as follows:

  • Google’s state-of-the-art AI: The Document AI platform is built upon Google’s industry-leading AI innovations in various fields, including computer vision (including OCR), NLP, and semantic search, to make this platform highly accurate...

Overview of existing Document AI processors

As discussed previously, the Document AI platform provides prebuilt parsers for general-purpose, as well as some specialized, use cases. As these processors are prebuilt, they are readily available to use in any relevant use case with very little effort. Before jumping into an example of how these processors work, let’s first look at the list of available processors as part of Google Cloud’s Document AI platform:

  • Document OCR: Identifies and extracts both machine-printed and handwritten text from documents in over 200 languages
  • Form Parser: Extracts key-value pairs (entity and checkbox), tables, and generic entities in addition to OCR text
  • Intelligent Document Quality Processor: Assesses the quality of documents based on their readability and provides a quality score
  • Document Splitter: Automatically splits documents based on logical boundaries
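To make the Form Parser output more concrete, the following is a minimal sketch of how key-value pairs might be read out of a parsed document. It assumes the JSON form of a Document AI `Document`, in which each form field points into the document's full `text` through `textAnchor.textSegments` offsets. The sample document is hand-written for illustration, and the helper names `anchor_text` and `extract_form_fields` are hypothetical and do not come from the book's repository. Check the field names against the current API reference before relying on them.

```python
# Hypothetical sample shaped like the JSON form of a Document AI `Document`.
# It is hand-written for illustration and is not real service output.
SAMPLE_DOCUMENT = {
    "text": "Name: Ada Lovelace\nDate: 1843-10-01\n",
    "pages": [
        {
            "formFields": [
                {
                    # In the JSON encoding, offsets arrive as strings and
                    # startIndex may be omitted when it is 0.
                    "fieldName": {"textAnchor": {"textSegments": [{"endIndex": "5"}]}},
                    "fieldValue": {"textAnchor": {"textSegments": [{"startIndex": "6", "endIndex": "18"}]}},
                },
                {
                    "fieldName": {"textAnchor": {"textSegments": [{"startIndex": "19", "endIndex": "24"}]}},
                    "fieldValue": {"textAnchor": {"textSegments": [{"startIndex": "25", "endIndex": "35"}]}},
                },
            ]
        }
    ],
}


def anchor_text(layout: dict, full_text: str) -> str:
    """Resolve a layout's text segments against the document's full text."""
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    parts = []
    for segment in segments:
        start = int(segment.get("startIndex", 0))
        end = int(segment.get("endIndex", 0))
        parts.append(full_text[start:end])
    return "".join(parts).strip()


def extract_form_fields(document: dict) -> dict:
    """Collect form-field names and values from every page into one dict."""
    full_text = document.get("text", "")
    fields = {}
    for page in document.get("pages", []):
        for field in page.get("formFields", []):
            name = anchor_text(field.get("fieldName", {}), full_text).rstrip(":")
            value = anchor_text(field.get("fieldValue", {}), full_text)
            fields[name] = value
    return fields


print(extract_form_fields(SAMPLE_DOCUMENT))
# prints {'Name': 'Ada Lovelace', 'Date': '1843-10-01'}
```

Resolving text through offsets rather than copying strings means one `text` field can serve every entity, table cell, and form field in the document. Note that this sketch keeps only the last value when a field name repeats.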

Document AI provides us with numerous specialized processors...

Creating custom Document AI processors

If we are unable to find a suitable prebuilt processor for our use case, Document AI Workbench lets us build and train our own tailored processors from scratch and with minimal effort. If we go to the Workbench tab inside Document AI, we’ll get the following options for creating a custom processor (see Figure 13.7):

Figure 13.7 – Document AI Workbench for creating custom model-based processors


In this exercise, we will work with the Custom Document Extractor solution to create a custom processor. Once we click on CREATE PROCESSOR, we will be able to find this processor within the My Processors tab. If we click on the processor, we will get options for training, evaluating, and testing our custom processor, as well as options for managing deployed versions of custom models. After training a version, we can also configure the Human-in-the-loop feature. See Figure 13.8 for these options:

Figure 13.8 – Custom Document AI processor details in the Google Cloud console UI...
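Once a custom processor version is trained and deployed, it can be called programmatically as well as from the console. The sketch below only builds the URL and JSON body for such a call and sends nothing over the network. It assumes the Document AI v1 REST `process` method, a regional endpoint of the form `{location}-documentai.googleapis.com`, and a `rawDocument` payload holding base64-encoded content. The function name `build_process_request` and every ID used here are hypothetical placeholders, and the endpoint details should be verified against the current API reference.

```python
import base64


def build_process_request(project_id, location, processor_id, content,
                          mime_type, version_id=None):
    """Return (url, body) for a Document AI process call without sending it."""
    # Resource name of the processor, optionally pinned to a trained version.
    name = f"projects/{project_id}/locations/{location}/processors/{processor_id}"
    if version_id:
        name += f"/processorVersions/{version_id}"

    # Assumed regional endpoint format; confirm it for your location.
    url = f"https://{location}-documentai.googleapis.com/v1/{name}:process"

    # The file bytes travel as base64 text inside the JSON body.
    body = {
        "rawDocument": {
            "content": base64.b64encode(content).decode("ascii"),
            "mimeType": mime_type,
        }
    }
    return url, body


# Placeholder IDs for illustration only.
url, body = build_process_request(
    project_id="my-project",
    location="us",
    processor_id="my-processor-id",
    content=b"%PDF-1.4 placeholder bytes",
    mime_type="application/pdf",
    version_id="my-version-id",
)
print(url)
```

Pinning `version_id` targets one specific trained version. Leaving it out addresses the processor itself, which is expected to route to whichever version is set as its default.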

Summary

This chapter highlighted the fact that every business uses many forms of documents (such as emails, contracts, forms, PDFs, and images) to share and store information. Document AI is an end-to-end solution on Google Cloud that lets us extract this information in a structured form so that it can be readily used to train ML models or perform other downstream tasks that derive value from the information within these documents.

By completing this chapter, you should now be familiar with Document AI and understand why it matters to businesses. You should also have a good understanding of the prebuilt processors within Document AI and should be able to integrate them into your applications easily. Finally, if the prebuilt processors don’t meet your needs, there are options to build custom processors tailored to your use case.

We now have a good understanding of Document AI on Google Cloud. In the next chapter, we will learn about more Google...

