
You're reading from Hands-On Computer Vision with Detectron2

Product type: Book
Published in: Apr 2023
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800561625
Edition: 1st
Author: Van Vung Pham

Van Vung Pham is a passionate research scientist in machine learning, deep learning, data science, and data visualization. He has years of experience and numerous publications in these areas. He is currently working on projects that use deep learning to predict road damage from pictures or videos taken of roads. One of these projects uses Detectron2 and Faster R-CNN to predict and classify road damage and achieves state-of-the-art results for this task. Dr. Pham obtained his PhD from the Computer Science Department at Texas Tech University, Lubbock, Texas, USA. He is currently an assistant professor in the Computer Science Department at Sam Houston State University, Huntsville, Texas, USA.

Training Custom Object Detection Models

This chapter starts with an introduction to the dataset and the dataset preprocessing steps. It then continues with the steps to train an object detection model using the default trainer, including an option for saving and resuming training. The chapter then describes how to select the best model from a set of trained models. It also provides the steps to perform object detection on images, with discussions on establishing appropriate inference thresholds and classification confidence values. Additionally, it details the development of a custom trainer by extending the default trainer and incorporating a hook into the training process.

By the end of this chapter, you will be able to train Detectron2’s models using the default trainer provided by Detectron2 and develop custom trainers to incorporate more customizations into the training process. Specifically, this chapter covers the following topics:

  • Processing data
  • Using the...

Technical requirements

You should have completed Chapter 1 to have an appropriate development environment for Detectron2. All the code, datasets, and results are available in the GitHub repository of the book (under the folder named Chapter05) at https://github.com/PacktPublishing/Hands-On-Computer-Vision-with-Detectron2. It is highly recommended to download the code and follow along.

Important note

This chapter executes code on a Google Colab instance. You should either map a Google Drive folder and store the output in a mapped folder or download the output and save it for future use. Alternatively, connect these Google Colab notebooks to a local instance, if you have one, to keep the results in permanent storage and utilize better computation resources.

Processing data

The following sections describe the dataset used in this chapter and discuss the typical steps for training Detectron2 models on custom datasets. The steps include exploring the dataset, converting the dataset into COCO format, registering the dataset with Detectron2, and finally, displaying some example images and the corresponding brain tumor labels.

The dataset

The dataset used is the brain tumor object detection dataset available from Kaggle (https://www.kaggle.com/datasets/davidbroberts/brain-tumor-object-detection-datasets), which has been copied to the GitHub repository of this book to ensure its accessibility. This dataset is chosen because medical image processing is a critical subfield in computer vision. At the same time, the task is challenging, and the number of images is appropriate for demonstration purposes.

Downloading and performing initial explorations

The first step in data processing is downloading and performing initial data explorations...

Using the default trainer

Detectron2 provides a default trainer class, which helps to train Detectron2 models on custom datasets conveniently. First, we download the datasets converted in the previous section and unzip them:

!wget -q https://github.com/PacktPublishing/Hands-On-Computer-Vision-with-Detectron2/raw/main/datasets/braintumors_coco.zip
!unzip -q braintumors_coco.zip

Next, install Detectron2 and register the train/test datasets using the exact code snippets provided in the previous section. Additionally, before training, run the following code snippet to prepare a logger that Detectron2 uses to log training/inferencing information:

from detectron2.utils.logger import setup_logger
logger = setup_logger()

With the datasets registered and the logger set up, the next step is creating a training configuration. Specifically, we set the output directory (where we will store the logging events and the trained models), the path to store the configuration file to...
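As a sketch of what such a configuration typically looks like (the base model, dataset names, class count, and output directory below are illustrative assumptions, not necessarily the exact values used in this chapter):

```python
# Sketch of a typical Detectron2 training configuration.
# The model choice, dataset names, class count, and paths are
# illustrative assumptions for this example.
from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()
# Start from a Faster R-CNN baseline from the model zoo
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("braintumors_coco_train",)  # assumed dataset name
cfg.DATASETS.TEST = ("braintumors_coco_test",)    # assumed dataset name
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2  # assumed number of tumor classes
cfg.OUTPUT_DIR = "output"  # stores logging events and trained models
```

With a configuration in hand, `DefaultTrainer(cfg)` followed by `trainer.resume_or_load(resume=True)` and `trainer.train()` starts (or resumes) training.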

Selecting the best model

Selecting the best model requires evaluation metrics. Therefore, we need to understand the common evaluation terminology and metrics used for object detection tasks before choosing the best model. Additionally, once the best model is chosen, this section covers code to sample and visualize a few prediction results to evaluate it qualitatively.

Evaluation metrics for object detection models

Two main evaluation metrics are used for the object detection task: mAP@0.5 (or AP50) and F1-score (or F1). The former is the mean of average precisions (mAP) at the intersection over union (IoU) threshold of 0.5 and is used to select the best models. The latter is the harmonic mean of precision and recall and is used to report how the chosen model performs on a specific dataset. The definitions of these two metrics use the computation of Precision and Recall:

Here, TP (for...
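As a concrete reference, these are the standard definitions (a framework-independent sketch, not code from the book) computed from the true positive (TP), false positive (FP), and false negative (FN) counts, together with the IoU computation that decides whether a detection counts as a true positive:

```python
def precision(tp, fp):
    # Precision: fraction of predicted boxes that are correct
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    # Recall: fraction of ground-truth boxes that are detected
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(tp, fp, fn):
    # F1: harmonic mean of precision and recall
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0

def iou(box_a, box_b):
    # Intersection over union of two boxes given as (x1, y1, x2, y2).
    # A detection is a TP when its IoU with a ground-truth box meets
    # the threshold (0.5 for AP50/mAP@0.5).
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```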

Developing a custom trainer

There are several reasons for developing a Detectron2 custom trainer. For instance, we may want to customize the dataset loader to incorporate more image augmentation techniques or to add evaluators to assess how the trained models perform during training. The following code snippet covers the source code to build a custom trainer for the latter, and Chapter 8 covers the code for the former:

import os
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator
class BrainTumorTrainer(DefaultTrainer):
  @classmethod
  def build_evaluator(cls, cfg, dataset_name, output_folder=None):
    if output_folder is None:
      output_folder = cfg.OUTPUT_DIR
    else:
      output_folder = os.path.join(cfg.OUTPUT_DIR,
                  ...

Utilizing the hook system

The hook system allows you to plug classes into the training loop to execute tasks on training events. A custom hook inherits from a Detectron2 base class called detectron2.engine.HookBase. A hook allows the developer to execute tasks at four points in the training process by overriding the following methods:

  • before_train() to include tasks to be executed before the first training iteration
  • after_train() to include tasks to be executed after training completes
  • before_step() to include tasks to be executed before each training iteration
  • after_step() to include tasks to be executed after each training iteration
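To illustrate the pattern independent of Detectron2 (a minimal sketch; the real base class is detectron2.engine.HookBase, whose methods are named before_train, after_train, before_step, and after_step, and whose calls are driven by the trainer), a hook system boils down to the trainer invoking registered callbacks around its training loop:

```python
# Framework-independent sketch of a hook system; not Detectron2 code.
class Hook:
    # Base class: each event method is a no-op by default,
    # so subclasses override only the events they care about.
    def before_train(self): pass
    def after_train(self): pass
    def before_step(self): pass
    def after_step(self): pass

class LoggingHook(Hook):
    # Records which events fired, in order.
    def __init__(self):
        self.events = []
    def before_train(self): self.events.append("before_train")
    def after_train(self): self.events.append("after_train")
    def before_step(self): self.events.append("before_step")
    def after_step(self): self.events.append("after_step")

def run_training(hooks, num_iterations):
    # Mimics how a trainer drives hooks around its training loop.
    for h in hooks: h.before_train()
    for _ in range(num_iterations):
        for h in hooks: h.before_step()
        # ... one training iteration would run here ...
        for h in hooks: h.after_step()
    for h in hooks: h.after_train()
```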

The following code snippet creates a hook to read the evaluation metrics generated by COCOEvaluator from the previously built custom trainer, keeps track of the best model with the highest mAP@0.5 value, and saves the model as model_best.pth:

# Some import statements are removed for space efficiency
class BestModelHook(HookBase):
  ...
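The elided hook body essentially implements "keep the best model so far." As a framework-independent sketch of that tracking logic (the actual reading of COCOEvaluator metrics and the checkpointing to model_best.pth are Detectron2-specific and omitted here; the save callback is a stand-in):

```python
class BestMetricTracker:
    # Tracks the highest value of a metric (e.g., mAP@0.5) seen so far.
    # In the real hook, `save` would call Detectron2's checkpointer to
    # write model_best.pth; here it is a pluggable callback.
    def __init__(self, save):
        self.best = float("-inf")
        self.save = save

    def update(self, metric_value):
        # Returns True when a new best value was found and saved.
        if metric_value > self.best:
            self.best = metric_value
            self.save()
            return True
        return False
```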

Summary

This chapter discussed the steps to explore, process, and prepare a custom dataset for training object detection models using Detectron2. After processing the dataset, it is relatively easy to register the train, test, and evaluation data (if there is any) with Detectron2 and start training object detection models using the default trainer. The training process may result in many models. Therefore, this chapter provided the standard evaluation metrics and approaches for selecting the best model. The default trainer may meet the most common training requirements. However, in several cases, a custom trainer may be necessary to incorporate more customizations into the training process. This chapter provided code snippets to build a custom trainer that incorporates evaluations on the test set during training. It also provided a code snippet for a custom hook that extracts the evaluation metrics and stores the best model during training.

The next chapter, Chapter 6, uses TensorBoard...

