You're reading from  Hands-On Computer Vision with Detectron2

Product type: Book
Published in: Apr 2023
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800561625
Edition: 1st Edition
Author: Van Vung Pham

Van Vung Pham is a passionate research scientist in machine learning, deep learning, data science, and data visualization. He has years of experience and numerous publications in these areas. He is currently working on projects that use deep learning to predict road damage from pictures or videos of roads. One of these projects uses Detectron2 and Faster R-CNN to predict and classify road damage and achieves state-of-the-art results for this task. Dr. Pham obtained his PhD from the Computer Science Department at Texas Tech University, Lubbock, Texas, USA. He is currently an assistant professor in the Computer Science Department at Sam Houston State University, Huntsville, Texas, USA.

Applying Train-Time and Test-Time Image Augmentations

The previous chapter introduced the augmentation and transformation classes that Detectron2 offers. This chapter shows how to apply those classes during training. Detectron2's built-in augmentation classes each work on the annotations of a single input at a time, whereas modern techniques may need to combine annotations from several inputs. This chapter therefore also explains the foundations of Detectron2's data loader component. Understanding this component clarifies how existing image augmentations are applied and how to modify existing code to implement custom techniques that load data from several inputs. Finally, this chapter details the steps for applying image augmentations at test time to improve accuracy.

By the end of this chapter, you will be able to understand how Detectron2 loads its data, how to apply existing...

Technical requirements

You should have completed Chapter 1 to have an appropriate development environment for Detectron2. All the code, datasets, and results are available on the GitHub repo of the book at https://github.com/PacktPublishing/Hands-On-Computer-Vision-with-Detectron2.

The Detectron2 data loader

Applying augmentations in Detectron2 can be both straightforward and complicated. It is relatively easy to use the declarative approach and apply the existing transformations and augmentations provided by Detectron2, which should meet the most common needs. However, adding custom augmentations that require multiple data samples (e.g., MixUp and Mosaic) is more involved. This section describes how Detectron2 loads data and how to incorporate existing and custom data augmentations into the training of Detectron2 models. Figure 9.1 illustrates the steps and main components of the Detectron2 data loading system.

Figure 9.1: Loading data and data augmentations in Detectron2

There are classes for Dataset, Sampler, Mapper, and Loader. The Dataset component normally stores a list of data items in JSON format. The Sampler component helps to randomly select one data item (dataset_dict) from the dataset. The selected data item has...
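The division of labor among these four components can be sketched in plain Python. This is only a toy model of the idea, not Detectron2's implementation: the function names (sampler, mapper, loader), the dictionary fields, and the "mapped" flag are all hypothetical and chosen for illustration.

```python
import random

# Dataset: a list of dicts, one per image (field names are illustrative only).
dataset = [
    {"file_name": f"img_{i}.jpg",
     "annotations": [{"bbox": [0, 0, 10, 10], "category_id": 0}]}
    for i in range(6)
]

def sampler(n_items, seed=0):
    """Yield dataset indices in a freshly shuffled order, epoch after epoch."""
    rng = random.Random(seed)
    while True:
        order = list(range(n_items))
        rng.shuffle(order)
        yield from order

def mapper(dataset_dict):
    """Turn one dataset_dict into a model input.

    A real mapper would read the image and apply augmentations here;
    this toy version only copies the entry and tags it.
    """
    item = dict(dataset_dict)  # copy so the stored dataset entry is not mutated
    item["mapped"] = True
    return item

def loader(data, batch_size, seed=0):
    """Group mapped items into batches (lists of dicts)."""
    batch = []
    for idx in sampler(len(data), seed):
        batch.append(mapper(data[idx]))
        if len(batch) == batch_size:
            yield batch
            batch = []

batches = loader(dataset, batch_size=2)
first_batch = next(batches)
```

Because the mapper sees exactly one dataset_dict per call, anything that needs a second sample has to reach outside this function, which is the limitation the rest of the chapter addresses.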

Applying existing image augmentation techniques

Augmentations are in the Mapper component. A Mapper receives a list of augmentations and applies them to the image and annotations accordingly. The following code snippet creates a Detectron2 trainer and specifies a list of existing augmentations to use:

class MyTrainer(DefaultTrainer):
  @classmethod
  def build_train_loader(cls, cfg):
    augs = []
    # Aug 1: Add RandomBrightness with 50% chance
    # Aug 2: Add ResizeShortestEdge
    # Aug 3: Add RandomFlipping
    mapper = DatasetMapper(cfg,
                           is_train       = True,
                 ...
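The listing above is truncated in this preview, and the sketch below does not attempt to complete it. Instead, it illustrates in NumPy what "applying a list of augmentations to the image and annotations accordingly" means: each augmentation receives the image together with its boxes and must keep the two consistent. All function names here are hypothetical, only brightness and flipping are covered, and the flip is applied unconditionally so that the result is easy to check.

```python
import numpy as np

def random_brightness(image, boxes, rng, low=0.8, high=1.2):
    """Scale pixel intensities by a random factor; boxes are unchanged."""
    factor = rng.uniform(low, high)
    out = np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)
    return out, boxes

def horizontal_flip(image, boxes, rng):
    """Mirror the image left-right and mirror the x-coordinates of XYXY boxes."""
    width = image.shape[1]
    flipped = boxes.copy()
    flipped[:, 0] = width - boxes[:, 2]
    flipped[:, 2] = width - boxes[:, 0]
    return image[:, ::-1].copy(), flipped

def with_probability(aug, prob):
    """Wrap an augmentation so that it is applied only with probability prob."""
    def wrapped(image, boxes, rng):
        if rng.random() < prob:
            return aug(image, boxes, rng)
        return image, boxes
    return wrapped

def apply_augmentations(augs, image, boxes, seed=0):
    """Apply each augmentation in order to the image and its boxes."""
    rng = np.random.default_rng(seed)
    for aug in augs:
        image, boxes = aug(image, boxes, rng)
    return image, boxes

# A 4x8 image with a bright stripe on the left and one box around that stripe.
image = np.zeros((4, 8, 3), dtype=np.uint8)
image[:, :2] = 200
boxes = np.array([[0.0, 0.0, 2.0, 4.0]])

augs = [with_probability(random_brightness, 0.5), horizontal_flip]
aug_image, aug_boxes = apply_augmentations(augs, image, boxes)
```

After the flip, the stripe and its box both sit on the right edge of the image, which is the consistency a mapper has to preserve for every geometric augmentation.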

Developing custom image augmentation techniques

Suppose the custom augmentation requires annotations loaded from one data sample (dataset_dict). In that case, it is relatively simple to implement the custom augmentation or transformation and incorporate it using the declarative approach described in the previous section. This section focuses on more complicated augmentation types that require loading and combining inputs from multiple samples. In these cases, we need to rewrite several parts of the Detectron2 data loader system. Thus, this section describes modifications to the Detectron2 data loader system and develops two custom image augmentations (MixUp and Mosaic) for illustration purposes.
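To see why such augmentations need more than one sample, consider a minimal NumPy sketch of the MixUp idea for detection: two images are blended and the annotations of both are kept. This is an illustration under simplifying assumptions (a fixed blending weight, top-left alignment, boxes in XYXY format), not the implementation developed in this chapter, and the function name mixup is hypothetical.

```python
import numpy as np

def mixup(image_a, boxes_a, image_b, boxes_b, lam=0.5):
    """Blend two HxWx3 images on a shared canvas and keep the boxes of both.

    lam weights image_a and (1 - lam) weights image_b.
    """
    height = max(image_a.shape[0], image_b.shape[0])
    width = max(image_a.shape[1], image_b.shape[1])
    canvas = np.zeros((height, width, 3), dtype=np.float32)
    # Both images are anchored at the top-left corner, so the box
    # coordinates of either image remain valid on the canvas.
    canvas[:image_a.shape[0], :image_a.shape[1]] += lam * image_a.astype(np.float32)
    canvas[:image_b.shape[0], :image_b.shape[1]] += (1.0 - lam) * image_b.astype(np.float32)
    mixed_boxes = np.concatenate([boxes_a, boxes_b], axis=0)
    return canvas.astype(np.uint8), mixed_boxes

# Two images of different shapes, each with one box.
image_a = np.full((2, 4, 3), 100, dtype=np.uint8)
boxes_a = np.array([[0.0, 0.0, 4.0, 2.0]])
image_b = np.full((4, 2, 3), 200, dtype=np.uint8)
boxes_b = np.array([[0.0, 0.0, 2.0, 4.0]])

mixed_image, mixed_boxes = mixup(image_a, boxes_a, image_b, boxes_b, lam=0.5)
```

The function takes two images and two sets of boxes as arguments, which a mapper that receives a single dataset_dict cannot supply on its own.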

Modifying the existing data loader

The following code snippet imports some required packages and creates an extended version (ExtendedAugInput) of the AugInput class that also enables passing a dataset_dict parameter. The reason is that augmentations such as MixUp and Mosaic may add annotations...

Applying test-time image augmentation techniques

Test-time augmentation (TTA) can improve prediction performance by running the model on several augmented versions of the input image and then applying non-maximum suppression (NMS) to the combined predictions. Detectron2 provides two classes for this: DatasetMapperTTA and GeneralizedRCNNWithTTA. The DatasetMapperTTA class maps a dataset dictionary (a data item in JSON format) into the format expected by Detectron2 models and can apply augmentations along the way. The default augmentations are ResizeShortestEdge and RandomFlip. The GeneralizedRCNNWithTTA class takes the original model and the Mapper object as inputs. It runs predictions on the augmented data and post-processes the resulting outputs.
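The merging step can be illustrated with a small NumPy sketch. It assumes that predictions were made on the original image and on a horizontally flipped copy, maps the flipped boxes back to original coordinates, and runs a simple NMS over the union. This is a simplification of the idea, not what GeneralizedRCNNWithTTA does internally, and the helper names (unflip_boxes, iou_one_to_many, nms) and box values are hypothetical.

```python
import numpy as np

def unflip_boxes(boxes, width):
    """Map XYXY boxes predicted on a horizontally flipped image back to the original."""
    restored = boxes.copy()
    restored[:, 0] = width - boxes[:, 2]
    restored[:, 2] = width - boxes[:, 0]
    return restored

def iou_one_to_many(box, others):
    """Intersection over union between one XYXY box and an array of boxes."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes that overlap it, repeat."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= iou_threshold]
    return keep

width = 100
# Predictions on the original image (made-up values).
boxes_orig = np.array([[10.0, 10.0, 50.0, 50.0], [60.0, 60.0, 90.0, 90.0]])
scores_orig = np.array([0.9, 0.6])
# Prediction on the flipped image, expressed in flipped coordinates.
boxes_flip = np.array([[48.0, 10.0, 88.0, 50.0]])
scores_flip = np.array([0.8])

all_boxes = np.concatenate([boxes_orig, unflip_boxes(boxes_flip, width)], axis=0)
all_scores = np.concatenate([scores_orig, scores_flip])
kept = nms(all_boxes, all_scores, iou_threshold=0.5)
merged_boxes = all_boxes[kept]
```

In this example the box found on the flipped image overlaps heavily with the first original box, so NMS keeps only the higher-scoring of the two and leaves the second object untouched.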

Let us use the code approach to explain these two classes. As a routine, we first install Detectron2, load the brain tumors dataset, and register the test dataset. Next, the following code snippet gets a pre-trained model...

Summary

This chapter described the steps to apply image augmentation techniques using Detectron2 at both train time and test time (inference time). Detectron2 provides a declarative approach for applying existing augmentations conveniently. However, the current system supports augmentations on a single input, while several modern image augmentations require data from different inputs. Therefore, this chapter described the Detectron2 data loader system and provided steps to modify several of its components so that modern image augmentation techniques that require multiple inputs, such as MixUp and Mosaic, can be applied. Lastly, this chapter described the features in Detectron2 that support test-time augmentation.

Congratulations! You now understand the Detectron2 architecture for object detection models and should have mastered the steps to prepare data, train, and fine-tune Detectron2 object detection models. The following part of this book has a similar...
