You're reading from The Definitive Guide to Google Vertex AI

Product type: Book

Published in: Dec 2023

Publisher: Packt

ISBN-13: 9781801815260

Edition: 1st Edition

Concepts

Data Science

Authors (2):

Jasmeet Bhatia

Kartik Chaudhary


Document AI – An End-to-End Solution for Processing Documents

Almost every business relies on documents of some kind to convey information every day: emails, contracts, forms, PDFs, and so on. Because this data is unstructured, many businesses fail to take advantage of the value it holds. Converting this large volume of document data into a machine-readable format can help with many useful tasks, such as automating business processes, running analytics, and applying AI and ML. Given the volume of data, it is often not feasible to parse these documents manually to extract information. Tools such as optical character recognition (OCR) can partially automate the task by converting a document into text, but that text is still unstructured, and more effort is required to make it useful.

Document AI is Google Cloud’s managed service that converts unstructured content...

Technical requirements

The code examples shown in this chapter can be found in the following GitHub repository: https://github.com/PacktPublishing/The-Definitive-Guide-to-Google-Vertex-AI/tree/main/Chapter13.

What is Document AI?

Document AI is an end-to-end, AI-based solution for extracting and classifying useful information from any kind of unstructured document, including scanned images, PDFs, forms, emails, and contracts. It includes pre-trained ML models for extraction and other document-related tasks, and it provides the flexibility to uptrain existing models and train custom models without writing much code. Document AI is a unified solution that helps businesses manage the entire unstructured document life cycle with high accuracy and low costs, which speeds up deployment and helps meet customer expectations.

Some key features of Google Cloud’s Document AI platform are as follows:

Google’s state-of-the-art AI: The Document AI platform is built upon Google’s industry-leading AI innovations in fields such as computer vision (including OCR), NLP, and semantic search, to make this platform highly accurate...

Overview of existing Document AI processors

As discussed previously, the Document AI platform provides prebuilt parsers for general-purpose, as well as some specialized, use cases. Because these processors are prebuilt, they can be used in any relevant use case with very little effort. Before jumping into an example of how these processors work, let’s first look at the list of processors available in Google Cloud’s Document AI platform:

Document OCR: Identifies and extracts both machine-printed and handwritten text from documents in over 200 languages
Form Parser: Extracts key-value pairs (entity and checkbox), tables, and generic entities in addition to OCR text
Intelligent Document Quality Processor: Assesses the quality of documents based on their readability and provides a quality score
Document Splitter: Automatically splits documents based on logical boundaries
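
To make the Form Parser output more concrete, the following is a minimal sketch of how key-value pairs might be read from a processor response. It assumes the response has been loaded as a Python dictionary in the JSON form of the Document AI Document object, where form fields point into the document text through text anchors. The field names used here (text, pages, formFields, fieldName, fieldValue, textAnchor, textSegments, startIndex, endIndex) reflect that format as we understand it and should be checked against the API reference. The sample document is hand-written for illustration and is not real processor output.

```python
from __future__ import annotations

from typing import Any


def anchor_text(full_text: str, layout: dict[str, Any]) -> str:
    """Resolve a layout's text anchor into the matching slice(s) of the document text."""
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    parts = []
    for segment in segments:
        # Assumption: in the JSON form, indices arrive as strings and a
        # startIndex of zero may be omitted entirely.
        start = int(segment.get("startIndex", 0))
        end = int(segment.get("endIndex", 0))
        parts.append(full_text[start:end])
    return "".join(parts).strip()


def extract_form_fields(document: dict[str, Any]) -> dict[str, str]:
    """Collect field name -> field value pairs from every page of a parsed document."""
    full_text = document.get("text", "")
    fields: dict[str, str] = {}
    for page in document.get("pages", []):
        for field in page.get("formFields", []):
            name = anchor_text(full_text, field.get("fieldName", {}))
            value = anchor_text(full_text, field.get("fieldValue", {}))
            fields[name] = value
    return fields


# Hypothetical, hand-written sample shaped like a Form Parser response.
sample_document = {
    "text": "Name: Jane Doe\nDate: 2023-12-01\n",
    "pages": [
        {
            "formFields": [
                {
                    "fieldName": {"textAnchor": {"textSegments": [{"endIndex": "5"}]}},
                    "fieldValue": {
                        "textAnchor": {
                            "textSegments": [{"startIndex": "6", "endIndex": "14"}]
                        }
                    },
                },
                {
                    "fieldName": {
                        "textAnchor": {
                            "textSegments": [{"startIndex": "15", "endIndex": "20"}]
                        }
                    },
                    "fieldValue": {
                        "textAnchor": {
                            "textSegments": [{"startIndex": "21", "endIndex": "31"}]
                        }
                    },
                },
            ]
        }
    ],
}

if __name__ == "__main__":
    for name, value in extract_form_fields(sample_document).items():
        print(f"{name} {value}")
```

Keeping the anchor lookup in its own helper means the same function could be reused for other layout elements that reference the document text in the same way.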

Document AI provides us with numerous specialized processors...

Creating custom Document AI processors

If we are unable to find a suitable prebuilt processor for our use case, Document AI Workbench lets us build and train our own tailored processors from scratch and with minimal effort. If we go to the Workbench tab inside Document AI, we’ll get the following options for creating a custom processor (see Figure 13.7):

Figure 13.7 – Document AI Workbench for creating custom model-based processors

In this exercise, we will work with the Custom Document Extractor solution to create a custom processor. Once we click on CREATE PROCESSOR, we will be able to find this processor within the My Processors tab. If we click on the processor, we will get options for training, evaluating, and testing our custom processor, as well as options for managing deployed versions of custom models. After training a version, we can also configure the Human-in-the-loop feature. See Figure 13.8 for these options:

Figure 13.8 – Custom Document AI processor details in the Google Cloud console UI
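
Once a custom processor has a deployed version, it can also be called outside the console. The following is a minimal sketch, using only the Python standard library, of how a request to a specific processor version might be assembled for the Document AI REST API. The resource name pattern, the regional endpoint, and the rawDocument request body follow the v1 REST API as we understand it; the project, location, processor, and version identifiers are hypothetical placeholders. The sketch only builds the URL and body: authentication and the actual HTTP call are deliberately left out.

```python
from __future__ import annotations

import base64
import json


def processor_name(
    project_id: str,
    location: str,
    processor_id: str,
    version_id: str | None = None,
) -> str:
    """Build the resource name of a processor, or of one of its versions."""
    name = f"projects/{project_id}/locations/{location}/processors/{processor_id}"
    if version_id is not None:
        name += f"/processorVersions/{version_id}"
    return name


def build_process_request(
    name: str, location: str, content: bytes, mime_type: str
) -> tuple[str, str]:
    """Return the (url, json_body) pair for a synchronous process call.

    Assumption: the service is reached through a regional host such as
    us-documentai.googleapis.com, and inline documents are sent base64-encoded.
    """
    url = f"https://{location}-documentai.googleapis.com/v1/{name}:process"
    body = json.dumps(
        {
            "rawDocument": {
                "content": base64.b64encode(content).decode("ascii"),
                "mimeType": mime_type,
            }
        }
    )
    return url, body


if __name__ == "__main__":
    # All identifiers below are made-up placeholders.
    name = processor_name("my-project", "us", "abc123", version_id="v1")
    url, body = build_process_request(name, "us", b"%PDF-1.4", "application/pdf")
    print(url)
    print(body)
```

Leaving version_id unset targets the processor itself rather than a pinned version, which is usually the simpler choice while a custom processor is still being iterated on.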

...

Summary

This chapter highlighted that nearly every business uses many forms of documents (such as emails, contracts, forms, PDFs, and images) to share and store information. Document AI is an end-to-end solution on Google Cloud that lets us extract this information in a structured way, so that it can be readily used to train ML models or perform other downstream tasks that derive value from the information within these documents.

By completing this chapter, you should now be confident about Document AI and its importance for every business. You should also have a good understanding of the prebuilt processors within Document AI and be able to integrate them into your applications easily. Finally, if the prebuilt processors don’t meet your needs, there are options to build custom processors that fit your use case.

We now have a good understanding of Document AI on Google Cloud. In the next chapter, we will learn about more Google...


Authors (2)

Jasmeet Bhatia

Jasmeet is a Machine Learning Architect with over 8 years of experience in Data Science and Machine Learning Engineering at Google and Microsoft, and overall has 17 years of experience in Product Engineering and Technology consulting at Deloitte, Disney, and Motorola. He has been involved in building technology solutions that focus on solving complex business problems by utilizing information and data assets. He has built high performing engineering teams, designed and built global scale AI/Machine Learning, Data Science, and Advanced analytics solutions for image recognition, natural language processing, sentiment analysis, and personalization.

Kartik Chaudhary

Kartik is an Artificial Intelligence and Machine Learning professional with 6+ years of industry experience in developing and architecting large scale AI/ML solutions using the technological advancements in the field of Machine Learning, Deep Learning, Computer Vision and Natural Language Processing. Kartik has filed 9 patents at the intersection of Machine Learning, Healthcare, and Operations. Kartik loves sharing knowledge, blogging, travel, and photography.
