
You're reading from The Definitive Guide to Google Vertex AI

Product type: Book
Published in: Dec 2023
Publisher: Packt
ISBN-13: 9781801815260
Edition: 1st Edition
Authors (2):
Jasmeet Bhatia

Jasmeet is a Machine Learning Architect with over 8 years of experience in Data Science and Machine Learning Engineering at Google and Microsoft, and 17 years of overall experience in Product Engineering and Technology consulting at Deloitte, Disney, and Motorola. He has been involved in building technology solutions that solve complex business problems by utilizing information and data assets. He has built high-performing engineering teams and has designed and built global-scale AI/Machine Learning, Data Science, and Advanced Analytics solutions for image recognition, natural language processing, sentiment analysis, and personalization.

Kartik Chaudhary

Kartik is an Artificial Intelligence and Machine Learning professional with 6+ years of industry experience in developing and architecting large-scale AI/ML solutions using technological advancements in Machine Learning, Deep Learning, Computer Vision, and Natural Language Processing. Kartik has filed 9 patents at the intersection of Machine Learning, Healthcare, and Operations. He loves sharing knowledge, blogging, travel, and photography.

Training Fully Custom ML Models with Vertex AI

In the previous chapters, we learned how to train no-code (AutoML) and low-code (BQML) Machine Learning (ML) models with minimal technical expertise. These solutions are very handy for solving common ML problems. However, sometimes the problem or the data itself is complex enough to require custom Artificial Intelligence (AI) models, in most cases large deep learning-based models. Working on custom models requires a significant level of technical expertise in ML, deep learning, and AI. Even with this expertise, it can be difficult to manage the training and experimentation of large-scale custom deep learning models due to a lack of resources, compute, and proper metadata tracking mechanisms.

To make the lives of ML developers easier, Vertex AI provides a managed environment for launching large-scale custom training jobs. Vertex AI-managed jobs let us track useful...

Technical requirements

This chapter requires a basic knowledge of neural networks and the TensorFlow deep learning framework. Code artifacts can be found in the following GitHub repo: https://github.com/PacktPublishing/The-Definitive-Guide-to-Google-Vertex-AI/tree/main/Chapter07

Building a basic deep learning model with TensorFlow

TensorFlow, or TF for short, is an end-to-end platform for building ML models. The main focus of the TensorFlow framework is to simplify the development, training, evaluation, and deployment of deep neural networks. When it comes to working with unstructured data (such as images, videos, and audio), neural network-based solutions have achieved significantly better results than traditional ML approaches that mostly rely on handcrafted features. Deep neural networks are good at learning complex patterns from high-dimensional data points (for example, an image with millions of pixels). In this section, we will develop a basic neural network-based model using TensorFlow. In the next few sections, we will see how Vertex AI can help with setting up scalable and systematic training/tuning of such custom models.
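As a quick illustration, the following is a minimal sketch of a small Keras model built with TensorFlow. The input shape, layer sizes, and (commented-out) dataset variables are illustrative placeholders rather than the chapter's exact architecture:

import tensorflow as tf

# A small illustrative convolutional network (placeholder shapes, not the
# chapter's exact model)
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# model.fit(train_x, train_y, validation_data=(val_x, val_y), epochs=5)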

Important Note

It is important to note that TensorFlow is not the only ML framework that Vertex AI supports. Vertex...

Packaging a model to submit it to Vertex AI as a training job

The previous section demonstrated a small image colorization experiment on a Vertex AI Workbench notebook. Notebooks are great for small-scale and quick experiments, but for large-scale experiments (with higher compute and/or memory requirements), it is advisable to launch them as Vertex AI jobs with the desired machine specifications (including accelerators such as GPUs or TPUs, if needed). Vertex AI jobs also let us run many experiments in parallel without waiting for the results of any single experiment. Experiment tracking is also straightforward with Vertex AI jobs, so it becomes easy to compare the latest experiments with past ones using the saved metadata and the Vertex AI UI. Now, let's take our model experimentation setup from the previous section and launch it as a Vertex AI training job.
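For reference, the following is a minimal sketch of how such a job can be submitted with the Vertex AI Python SDK (google-cloud-aiplatform). The project, bucket, script path, and requirements are placeholders, and the prebuilt container URI should be checked against the current list of Vertex AI training containers:

from google.cloud import aiplatform

# Placeholder project, region, and staging bucket -- replace with your own
aiplatform.init(project='my-gcp-project',
                location='us-central1',
                staging_bucket='gs://my-staging-bucket')

# Package the local training script as a custom training job; the container
# URI below is an example prebuilt TensorFlow GPU training image
job = aiplatform.CustomTrainingJob(
    display_name='image-colorization-training',
    script_path='trainer/task.py',
    container_uri='us-docker.pkg.dev/vertex-ai/training/tf-gpu.2-8:latest',
    requirements=['pillow'],
)

# Launch remotely on a GPU machine; the notebook does not need to stay open
job.run(machine_type='n1-standard-8',
        accelerator_type='NVIDIA_TESLA_T4',
        accelerator_count=1,
        replica_count=1)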

Important note

Vertex AI jobs run in a containerized environment,...

Monitoring model training progress

In the previous section, we saw how easy it is to launch a Vertex AI custom training job with the desired configuration and machine type. These Vertex AI training jobs are really useful for running large-scale experiments where training requires significant compute (multiple GPUs or TPUs) and may run for days. Such long-running experiments are not very feasible in a Jupyter Notebook-based environment. Another great thing about launching Vertex AI jobs is that all the metadata and lineage are tracked systematically, so we can come back later, look into past experiments, and compare them with the latest ones easily and accurately.

Another important aspect is monitoring the live progress of training jobs (including metrics such as loss and accuracy). For this purpose, we can easily set up Vertex AI TensorBoard within our Vertex AI job and track progress in near real time. In this section, we will set up a TensorBoard...
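As a sketch, streaming metrics to Vertex AI TensorBoard from the training code can be as simple as pointing a standard Keras TensorBoard callback at the log directory that Vertex AI injects. The environment variable, fallback path, resource names, and service account below are assumptions to verify against the Vertex AI documentation:

import os
import tensorflow as tf

# When a Vertex AI TensorBoard instance is attached to the job, the service
# injects AIP_TENSORBOARD_LOG_DIR; fall back to a GCS path for local runs
log_dir = os.environ.get('AIP_TENSORBOARD_LOG_DIR',
                         'gs://my-staging-bucket/tb-logs')
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir=log_dir,
                                                histogram_freq=1)
# model.fit(train_x, train_y, epochs=10, callbacks=[tensorboard_cb])

# On the launcher side, the job is associated with a TensorBoard instance
# (resource name and service account below are placeholders):
# job.run(
#     ...,
#     tensorboard='projects/PROJECT_NUMBER/locations/us-central1/tensorboards/TB_ID',
#     service_account='training-sa@my-gcp-project.iam.gserviceaccount.com',
# )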

Evaluating trained models

In this section, we will take the already trained model from the previous section and launch a batch inference job on the test data. The first step here will be to load our test data into a Jupyter Notebook:

from io import BytesIO
import numpy as np
from tensorflow.python.lib.io import file_io

# GCS bucket where the preprocessed test arrays were saved
dest = 'gs://data-bucket-417812395597/'

# Read the serialized NumPy arrays directly from Cloud Storage
test_x = np.load(BytesIO(file_io.read_file_to_string(dest + 'test_x', binary_mode=True)))
test_y = np.load(BytesIO(file_io.read_file_to_string(dest + 'test_y', binary_mode=True)))
print(test_x.shape, test_y.shape)

The next step is to create a JSON payload of instances from our test data and save it in a cloud storage location. The batch inference module will be able to read these instances and perform inference:

import json
BATCH_PREDICTION_INSTANCES_FILE = "batch_prediction_instances.jsonl"
BATCH_PREDICTION_GCS_SOURCE = (
  ...
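Once the instances file is in Cloud Storage, a batch prediction job can be launched against the registered model. The sketch below uses the Vertex AI SDK; the model resource name, instance key, and output prefix are placeholders, and the instance key in particular must match the serving signature of the exported model:

import json
import tensorflow as tf
from google.cloud import aiplatform

# Write one JSON instance per line (JSONL) directly to Cloud Storage;
# 'dest' is the bucket path defined earlier, 'input_1' is a placeholder key
instances_uri = dest + 'batch_prediction_instances.jsonl'
with tf.io.gfile.GFile(instances_uri, 'w') as f:
    for instance in test_x:
        f.write(json.dumps({'input_1': instance.tolist()}) + '\n')

# Launch a managed batch prediction job on the uploaded model
model = aiplatform.Model('projects/my-project/locations/us-central1/models/MODEL_ID')
batch_job = model.batch_predict(
    job_display_name='colorization-batch-eval',
    gcs_source=instances_uri,
    gcs_destination_prefix=dest + 'batch_predictions/',
    machine_type='n1-standard-4',
)
batch_job.wait()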

Summary

In this chapter, we learned how to work with the Vertex AI managed training environment and launch custom training jobs. Launching custom training jobs on Vertex AI comes with a number of advantages: managed metadata tracking, no need to actively monitor jobs, the ability to launch any number of experiments in parallel, the freedom to choose the machine specifications for each experiment, near real-time monitoring of training progress and results through the Cloud console UI, and managed batch inference jobs on saved models. Vertex AI training is also tightly integrated with other GCP products.

After reading this chapter, you should be able to develop and run custom deep learning models (using frameworks such as TensorFlow) on Vertex AI Workbench notebooks. You should also be able to launch long-running Vertex AI custom training jobs and understand the advantages of the managed Vertex AI training framework. The managed Google Cloud console interface and TensorBoard...
