You're reading from  Hands-On Computer Vision with Detectron2

Product type: Book
Published in: Apr 2023
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800561625
Edition: 1st Edition
Author (1)
Van Vung Pham

Van Vung Pham is a passionate research scientist in machine learning, deep learning, data science, and data visualization. He has years of experience and numerous publications in these areas. He is currently working on projects that use deep learning to predict road damage from pictures or videos of roads. One of these projects uses Detectron2 and Faster R-CNN to predict and classify road damage and achieves state-of-the-art results for this task. Dr. Pham obtained his PhD from the Computer Science Department at Texas Tech University, Lubbock, Texas, USA. He is currently an assistant professor in the Computer Science Department at Sam Houston State University, Huntsville, Texas, USA.

Image Data Augmentation Techniques

This chapter explains what image augmentation is, why it matters, and how to perform it, covering a set of standard and state-of-the-art techniques. With that foundation in place, the chapter introduces Detectron2’s image augmentation system, which has three main components: Transformation, Augmentation, and AugInput. It describes the classes in these components and how they work together to perform image augmentation while training Detectron2 models.

By the end of this chapter, you will understand important image augmentation techniques, how they work, and why they help improve model performance. Additionally, you will be able to perform these image augmentations in Detectron2. Specifically, this chapter covers the following topics:

  • Image augmentation techniques
  • Detectron2’s image augmentation system:
    • Transformation classes
    • Augmentation classes
    • The AugInput class...

Technical requirements

You must have completed Chapter 1 to have an appropriate development environment for Detectron2. All the code, datasets, and results are available in this book’s GitHub repository at https://github.com/PacktPublishing/Hands-On-Computer-Vision-with-Detectron2.

Image augmentation techniques

Image augmentation techniques can greatly improve the robustness and accuracy of deep learning models for computer vision. Detectron2 and many other modern computer vision architectures use image augmentation, so it is essential to understand these techniques and how Detectron2 uses them. This section covers what image augmentations are, why they are important, and how to perform them using popular methods. The next two sections explain how Detectron2 uses them during training and inference.

Why image augmentations?

Deep learning architectures with few weights may lack the capacity to fit the data accurately (a bias issue). Modern architectures therefore tend to be complex and have huge numbers of weights. Training these models often involves passing over the training dataset for several epochs, where one epoch means the whole training dataset is passed through the model once. Consequently, the huge numbers of weights may mean the models tend...

Detectron2’s image augmentation system

Detectron2’s image augmentation system has three main groups of classes: Transformation, Augmentation, and AugInput. These components augment images together with their related annotations (for example, bounding boxes, segmentation masks, and keypoints). Additionally, the system lets you declare a sequence of augmentations to apply, and it supports custom data types and custom operations. Figure 8.4 shows a simplified class diagram of Detectron2’s augmentation system:

Figure 8.4: Simplified class diagram of Detectron2’s augmentation system
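
To make the idea of a declared sequence of augmentations more concrete, here is a minimal NumPy sketch. It is not Detectron2's code or API: `SketchInput`, `hflip`, `crop`, and `apply_all` are hypothetical names invented for this illustration. The container only loosely plays the role the text assigns to AugInput, keeping an image and its boxes together so that every step updates both.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SketchInput:
    """Hypothetical container (not a Detectron2 class): an image and its boxes."""
    image: np.ndarray  # shape (H, W) or (H, W, C)
    boxes: np.ndarray  # shape (N, 4): x_min, y_min, x_max, y_max in pixels


def hflip(inp: SketchInput) -> SketchInput:
    """Mirror the image left-right and mirror the boxes with it."""
    width = inp.image.shape[1]
    x0, y0, x1, y1 = inp.boxes.T
    # After mirroring, the old right edge becomes the new left edge.
    boxes = np.stack([width - x1, y0, width - x0, y1], axis=1)
    return SketchInput(inp.image[:, ::-1], boxes)


def crop(x: int, y: int, w: int, h: int):
    """Build a step that crops a w-by-h window whose top-left corner is (x, y)."""
    def op(inp: SketchInput) -> SketchInput:
        image = inp.image[y:y + h, x:x + w]
        # Shift boxes into the window's coordinates, then clip to its bounds.
        boxes = inp.boxes - np.array([x, y, x, y])
        boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, w)
        boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, h)
        return SketchInput(image, boxes)
    return op


def apply_all(ops, inp: SketchInput) -> SketchInput:
    """Apply each declared step in order to the image and its boxes."""
    for op in ops:
        inp = op(inp)
    return inp


# The pipeline is declared up front as data, then applied to a sample.
pipeline = [hflip, crop(x=1, y=0, w=4, h=3)]
sample = SketchInput(
    image=np.arange(24).reshape(4, 6),
    boxes=np.array([[1.0, 1.0, 3.0, 2.0]]),
)
result = apply_all(pipeline, sample)
```

Because each step returns a new container, the original sample is left untouched, and the same declared pipeline can be reused across samples.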

The Transform and Augmentation classes are the bases for all the classes in their respective groups. Notably, boxes use the XYXY_ABS format, which specifies each box as (x_min, y_min, x_max, y_max) in absolute pixel coordinates. Generally, subclasses of the Transform base class perform the deterministic changes of the...
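
The split between a deterministic transform and a randomized policy that selects one can be sketched in plain NumPy. The classes below are simplified, hypothetical stand-ins written for this illustration, not Detectron2's implementation; the method names `apply_image`, `apply_box`, and `get_transform` are chosen to echo the roles described above, and boxes are assumed to be an (N, 4) array in XYXY_ABS format.

```python
import numpy as np


class HFlip:
    """Deterministic transform: mirror left-right for a known image width."""

    def __init__(self, width: int):
        self.width = width

    def apply_image(self, img: np.ndarray) -> np.ndarray:
        # Reverse the column axis of an (H, W) or (H, W, C) array.
        return img[:, ::-1]

    def apply_box(self, boxes: np.ndarray) -> np.ndarray:
        # boxes: (N, 4) as x_min, y_min, x_max, y_max in absolute pixels.
        boxes = np.asarray(boxes, dtype=np.float64)
        flipped = boxes.copy()
        # Mirroring swaps the roles of the left and right edges.
        flipped[:, 0] = self.width - boxes[:, 2]
        flipped[:, 2] = self.width - boxes[:, 0]
        return flipped


class NoOp:
    """Deterministic transform that leaves the image and boxes unchanged."""

    def apply_image(self, img: np.ndarray) -> np.ndarray:
        return img

    def apply_box(self, boxes: np.ndarray) -> np.ndarray:
        return np.asarray(boxes, dtype=np.float64)


class RandomHFlip:
    """Policy: decide at random, then hand back a deterministic transform."""

    def __init__(self, prob: float = 0.5, rng=None):
        self.prob = prob
        self.rng = rng if rng is not None else np.random.default_rng()

    def get_transform(self, img: np.ndarray):
        # The randomness lives only here; the returned object is deterministic.
        if self.rng.random() < self.prob:
            return HFlip(width=img.shape[1])
        return NoOp()


image = np.arange(12).reshape(3, 4)        # height 3, width 4
boxes = np.array([[0.0, 0.0, 1.0, 2.0]])   # one box in XYXY_ABS

tfm = RandomHFlip(prob=1.0).get_transform(image)  # prob=1.0 always flips
flipped_image = tfm.apply_image(image)
flipped_boxes = tfm.apply_box(boxes)       # [[3., 0., 4., 2.]]
```

Keeping the random decision separate from the operation means that one decision can be applied consistently to the image and to all of its annotations.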

Summary

This chapter introduced image augmentations and explained why they are essential in computer vision. Then, we covered common and state-of-the-art image augmentation techniques. After establishing the theoretical foundation, we looked at Detectron2’s image augmentation system, which has three main components, and their related classes: Transform, Augmentation, and AugInput. Detectron2 provides a declarative approach for applying existing augmentations conveniently.

The existing system supports augmentations on a single input, while several modern image augmentations require data from multiple inputs. Therefore, the next chapter will show you how to modify several Detectron2 data loader components so that you can apply modern image augmentation techniques. The next chapter also describes how to apply test-time augmentations.
