
You're reading from Hands-On Computer Vision with Detectron2

Product type: Book
Published in: Apr 2023
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800561625
Edition: 1st
Author: Van Vung Pham

Van Vung Pham is a passionate research scientist in machine learning, deep learning, data science, and data visualization. He has years of experience and numerous publications in these areas. He is currently working on projects that use deep learning to predict road damage from pictures or videos taken of roads. One of these projects uses Detectron2 and Faster R-CNN to predict and classify road damage and achieves state-of-the-art results for this task. Dr. Pham obtained his PhD from the Computer Science Department at Texas Tech University, Lubbock, Texas, USA. He is currently an assistant professor in the Computer Science Department at Sam Houston State University, Huntsville, Texas, USA.

Training Custom Object Detection Models

This chapter starts with an introduction to the dataset and the dataset preprocessing steps. It then continues with the steps to train an object detection model using the default trainer, including an option for saving and resuming training. The chapter then describes how to select the best model from a set of trained models. It also provides the steps to perform object detection on images, with discussions on establishing appropriate inference thresholds and classification confidence values. Additionally, it details the development of a custom trainer by extending the default trainer and incorporating a hook into the training process.

By the end of this chapter, you will be able to train Detectron2’s models using the default trainer provided by Detectron2 and develop custom trainers to incorporate more customizations into the training process. Specifically, this chapter covers the following topics:

  • Processing data
  • Using the...

Technical requirements

You should have completed Chapter 1 to have an appropriate development environment for Detectron2. All the code, datasets, and results are available in the GitHub repository of the book (under the folder named Chapter05) at https://github.com/PacktPublishing/Hands-On-Computer-Vision-with-Detectron2. It is highly recommended to download the code and follow along.

Important note

This chapter executes code on a Google Colab instance. You should either map a Google Drive folder and store the output in a mapped folder or download the output and save it for future use. Alternatively, connect these Google Colab notebooks to a local instance, if you have one, to keep the results in permanent storage and utilize better computation resources.

Processing data

The following sections describe the dataset used in this chapter and discuss the typical steps for training Detectron2 models on custom datasets. The steps include exploring the dataset, converting the dataset into COCO format, registering the dataset with Detectron2, and finally, displaying some example images and the corresponding brain tumor labels.

The dataset

The dataset used is the brain tumor object detection dataset available from Kaggle (https://www.kaggle.com/datasets/davidbroberts/brain-tumor-object-detection-datasets), which has been copied to the GitHub repository of this book to ensure its accessibility. This dataset is chosen because medical image processing is a critical subfield in computer vision. At the same time, the task is challenging, and the number of images is appropriate for demonstration purposes.

Downloading and performing initial explorations

The first step in data processing is downloading and performing initial data explorations...

Using the default trainer

Detectron2 provides a default trainer class, which helps to train Detectron2 models on custom datasets conveniently. First, we download the datasets converted in the previous section and unzip them:

!wget -q https://github.com/PacktPublishing/Hands-On-Computer-Vision-with-Detectron2/raw/main/datasets/braintumors_coco.zip
!unzip -q braintumors_coco.zip

Next, install Detectron2 and register the train/test datasets using the exact code snippets provided in the previous section. Additionally, before training, run the following code snippet to prepare a logger that Detectron2 uses to log training/inferencing information:

from detectron2.utils.logger import setup_logger
logger = setup_logger()

With the datasets registered and the logger set up, the next step is creating a training configuration. Specifically, we set the output directory (where we will store the logging events and the trained models), the path to store the configuration file to...
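As a sketch of what such a configuration typically looks like (the base model, dataset names, class count, and output directory below are illustrative assumptions, not necessarily the exact values used in this chapter):

```python
# Sketch of a typical Detectron2 training configuration.
# The model choice, dataset names, class count, and paths are
# illustrative assumptions for this example.
from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()
# Start from a Faster R-CNN baseline from the model zoo
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("braintumors_coco_train",)  # assumed dataset name
cfg.DATASETS.TEST = ("braintumors_coco_test",)    # assumed dataset name
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2  # assumed number of tumor classes
cfg.OUTPUT_DIR = "output"  # stores logging events and trained models
```

With a configuration in hand, `DefaultTrainer(cfg)` followed by `trainer.resume_or_load(resume=True)` and `trainer.train()` starts (or resumes) training.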

Selecting the best model

Selecting the best model requires evaluation metrics. Therefore, we need to understand the common evaluation terminology and metrics used for object detection tasks before choosing the best model. Additionally, once the best model is chosen, this section covers code to sample and visualize a few prediction results to evaluate it qualitatively.

Evaluation metrics for object detection models

Two main evaluation metrics are used for the object detection task: mAP@0.5 (or AP50) and F1-score (or F1). The former is the mean of average precisions (mAP) at the intersection over union (IoU) threshold of 0.5 and is used to select the best models. The latter is the harmonic mean of precision and recall and is used to report how the chosen model performs on a specific dataset. The definitions of these two metrics use the computation of Precision and Recall:

Here, TP (for...
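As a concrete reference, these are the standard definitions (a framework-independent sketch, not code from the book) computed from the true positive (TP), false positive (FP), and false negative (FN) counts, together with the IoU computation that decides whether a detection counts as a true positive:

```python
def precision(tp, fp):
    # Precision: fraction of predicted boxes that are correct
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    # Recall: fraction of ground-truth boxes that are detected
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(tp, fp, fn):
    # F1: harmonic mean of precision and recall
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) else 0.0

def iou(box_a, box_b):
    # Intersection over union of two boxes given as (x1, y1, x2, y2).
    # A detection is a TP when its IoU with a ground-truth box meets
    # the threshold (0.5 for AP50/mAP@0.5).
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```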

Developing a custom trainer

There are several reasons for developing a Detectron2 custom trainer. For instance, we may want to customize the dataset loader to incorporate more image augmentation techniques or to add evaluators to assess how the trained models perform during training. The following code snippet covers the source code to build a custom trainer for the latter, and Chapter 8 covers the code for the former:

import os
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator
class BrainTumorTrainer(DefaultTrainer):
  @classmethod
  def build_evaluator(cls, cfg, dataset_name, output_folder=None):
    if output_folder is None:
      output_folder = cfg.OUTPUT_DIR
    else:
      output_folder = os.path.join(cfg.OUTPUT_DIR,
                  ...

Utilizing the hook system

The hook system allows you to plug classes into the training loop to execute tasks on training events. A custom hook inherits from a Detectron2 base class called detectron2.engine.HookBase. A hook allows the developer to execute tasks at four points in the training process by overriding the following methods:

  • before_train() to include tasks to be executed before the first training iteration
  • after_train() to include tasks to be executed after training completes
  • before_step() to include tasks to be executed before each training iteration
  • after_step() to include tasks to be executed after each training iteration
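To illustrate the pattern independent of Detectron2 (a minimal sketch; the real base class is detectron2.engine.HookBase, whose methods are named before_train, after_train, before_step, and after_step, and whose calls are driven by the trainer), a hook system boils down to the trainer invoking registered callbacks around its training loop:

```python
# Framework-independent sketch of a hook system; not Detectron2 code.
class Hook:
    # Base class: each event method is a no-op by default,
    # so subclasses override only the events they care about.
    def before_train(self): pass
    def after_train(self): pass
    def before_step(self): pass
    def after_step(self): pass

class LoggingHook(Hook):
    # Records which events fired, in order.
    def __init__(self):
        self.events = []
    def before_train(self): self.events.append("before_train")
    def after_train(self): self.events.append("after_train")
    def before_step(self): self.events.append("before_step")
    def after_step(self): self.events.append("after_step")

def run_training(hooks, num_iterations):
    # Mimics how a trainer drives hooks around its training loop.
    for h in hooks: h.before_train()
    for _ in range(num_iterations):
        for h in hooks: h.before_step()
        # ... one training iteration would run here ...
        for h in hooks: h.after_step()
    for h in hooks: h.after_train()
```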

The following code snippet creates a hook to read the evaluation metrics generated by COCOEvaluator from the previously built custom trainer, keeps track of the best model with the highest mAP@0.5 value, and saves the model as model_best.pth:

# Some import statements are removed for space efficiency
class BestModelHook(HookBase):
  ...
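The elided hook body essentially implements "keep the best model so far." As a framework-independent sketch of that tracking logic (the actual reading of COCOEvaluator metrics and the checkpointing to model_best.pth are Detectron2-specific and omitted here; the save callback is a stand-in):

```python
class BestMetricTracker:
    # Tracks the highest value of a metric (e.g., mAP@0.5) seen so far.
    # In the real hook, `save` would call Detectron2's checkpointer to
    # write model_best.pth; here it is a pluggable callback.
    def __init__(self, save):
        self.best = float("-inf")
        self.save = save

    def update(self, metric_value):
        # Returns True when a new best value was found and saved.
        if metric_value > self.best:
            self.best = metric_value
            self.save()
            return True
        return False
```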

Summary

This chapter discussed the steps to explore, process, and prepare a custom dataset for training object detection models using Detectron2. After processing the dataset, it is relatively easy to register the train, test, and evaluation data (if there is any) with Detectron2 and start training object detection models using the default trainer. The training process may result in many models. Therefore, this chapter provided the standard evaluation metrics and approaches for selecting the best model. The default trainer may meet the most common training requirements. However, in several cases, a custom trainer may be necessary to incorporate more customizations into the training process. This chapter provided code snippets to build a custom trainer that incorporates evaluations on the test set during training. It also provided a code snippet for a custom hook that extracts the evaluation metrics and stores the best model during training.

The next chapter, Chapter 6, uses TensorBoard...

