You're reading from Hands-On Computer Vision with Detectron2

Product type: Book
Published in: Apr 2023
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781800561625
Edition: 1st Edition
Languages: Python
Tools: PyTorch
Concepts: Computer Vision
Author: Van Vung Pham

Image Data Augmentation Techniques

This chapter explains what image augmentation is, why it matters, and how to perform it, using a set of standard and state-of-the-art techniques. Once you have this foundational knowledge, the chapter introduces Detectron2’s image augmentation system, which has three main components: Transformation, Augmentation, and AugInput. It describes the classes in these components and how they work together to augment images while training Detectron2 models.

By the end of this chapter, you will understand important image augmentation techniques, how they work, and why they help improve model performance. Additionally, you will be able to perform these image augmentations in Detectron2. Specifically, this chapter covers the following topics:

Image augmentation techniques
Detectron2’s image augmentation system:
- Transformation classes
- Augmentation classes
- The AugInput class...

Technical requirements

You must have completed Chapter 1 to have an appropriate development environment for Detectron2. All the code, datasets, and results are available in this book’s GitHub repository at https://github.com/PacktPublishing/Hands-On-Computer-Vision-with-Detectron2.

Image augmentation techniques

Image augmentation techniques greatly improve the robustness and accuracy of deep learning models for computer vision. Detectron2 and many other modern computer vision architectures use image augmentation, so it is essential to understand these techniques and how Detectron2 applies them. This section covers what image augmentations are, why they are important, and which popular methods are used to perform them. The next two sections explain how Detectron2 uses them during training and inference.

Why image augmentations?

Deep learning architectures with a small number of weights may be too simple to fit the data accurately (a bias issue). Therefore, modern architectures tend to be complex and have huge numbers of weights. Training these models often involves passing through the training dataset for several epochs, where one epoch means the whole training dataset is passed through the model once. Therefore, the huge numbers of weights may mean the models tend...

Detectron2’s image augmentation system

Detectron2’s image augmentation system has three main groups of classes: Transformation, Augmentation, and AugInput. These components augment images together with their related annotations (for example, bounding boxes, segmentation masks, and keypoints). The system also lets you declare a sequence of augmentations to apply, and it supports custom data types and custom operations. Figure 8.4 shows a simplified class diagram of Detectron2’s augmentation system:

Figure 8.4: Simplified class diagram of Detectron2’s augmentation system
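The division of roles among the three groups can be illustrated with a small, self-contained sketch. It does not use Detectron2 itself: the classes `HFlip`, `NoOp`, `RandomHFlip`, and `Sample`, and the helper `apply_augmentations`, are hypothetical stand-ins written in NumPy, and they only mimic the general pattern (a random policy chooses a deterministic operation, which is then applied consistently to an image and its boxes).

```python
import random
from dataclasses import dataclass

import numpy as np


class HFlip:
    """Deterministic operation: mirrors pixel columns and box x-coordinates."""

    def __init__(self, width: int):
        self.width = width

    def apply_image(self, image: np.ndarray) -> np.ndarray:
        return image[:, ::-1].copy()

    def apply_boxes(self, boxes: np.ndarray) -> np.ndarray:
        # Boxes are (x_min, y_min, x_max, y_max) in absolute pixels.
        out = boxes.copy()
        out[:, 0] = self.width - boxes[:, 2]
        out[:, 2] = self.width - boxes[:, 0]
        return out


class NoOp:
    """Deterministic operation that leaves the data unchanged."""

    def apply_image(self, image: np.ndarray) -> np.ndarray:
        return image

    def apply_boxes(self, boxes: np.ndarray) -> np.ndarray:
        return boxes


class RandomHFlip:
    """Policy: decides, per input, which deterministic operation to use."""

    def __init__(self, prob: float = 0.5, rng: random.Random = None):
        self.prob = prob
        self.rng = rng or random.Random()

    def get_transform(self, image: np.ndarray):
        if self.rng.random() < self.prob:
            return HFlip(width=image.shape[1])
        return NoOp()


@dataclass
class Sample:
    """Container that keeps an image and its annotations together."""

    image: np.ndarray
    boxes: np.ndarray


def apply_augmentations(augs, sample: Sample):
    """Apply each policy in order, updating the sample in place."""
    applied = []
    for aug in augs:
        tfm = aug.get_transform(sample.image)
        sample.image = tfm.apply_image(sample.image)
        sample.boxes = tfm.apply_boxes(sample.boxes)
        applied.append(tfm)
    return applied


sample = Sample(
    image=np.zeros((4, 6, 3), dtype=np.uint8),  # height 4, width 6
    boxes=np.array([[1.0, 0.0, 3.0, 2.0]]),
)
applied = apply_augmentations([RandomHFlip(prob=1.0)], sample)
print(sample.boxes)
```

Returning the list of applied operations is a deliberate choice in this sketch: it means the same deterministic changes could later be replayed on other data tied to the same image.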

The Transform and Augmentation classes are the base classes for all the classes in their respective groups. Notably, boxes are expected in XYXY_ABS mode, meaning each box is given as (x_min, y_min, x_max, y_max) in absolute pixel coordinates. Generally, subclasses of the Transform base class perform the deterministic changes of the...
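To make the XYXY_ABS convention concrete, the following minimal sketch converts a box from (x, y, width, height) form into (x_min, y_min, x_max, y_max) form and then applies a deterministic resize to it. The helper names `xywh_to_xyxy` and `scale_boxes_xyxy` are hypothetical and written in plain NumPy for illustration; they are not Detectron2 functions.

```python
import numpy as np


def xywh_to_xyxy(boxes) -> np.ndarray:
    """Convert (x, y, width, height) boxes to (x_min, y_min, x_max, y_max)."""
    boxes = np.asarray(boxes, dtype=np.float64)
    out = boxes.copy()
    out[:, 2] = boxes[:, 0] + boxes[:, 2]  # x_max = x + width
    out[:, 3] = boxes[:, 1] + boxes[:, 3]  # y_max = y + height
    return out


def scale_boxes_xyxy(boxes: np.ndarray, old_size, new_size) -> np.ndarray:
    """Rescale absolute-pixel boxes when an image is resized.

    Sizes are given as (height, width).
    """
    old_h, old_w = old_size
    new_h, new_w = new_size
    sx, sy = new_w / old_w, new_h / old_h
    return boxes * np.array([sx, sy, sx, sy])


boxes_xywh = [[10, 20, 30, 40]]
boxes_xyxy = xywh_to_xyxy(boxes_xywh)
resized = scale_boxes_xyxy(boxes_xyxy, old_size=(100, 200), new_size=(50, 100))
print(boxes_xyxy)  # [[10. 20. 40. 60.]]
print(resized)     # [[ 5. 10. 20. 30.]]
```

Because the coordinates are absolute pixels rather than fractions of the image size, any operation that changes the image geometry has to update the boxes as well, as the resize above does.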

Summary

This chapter introduced image augmentations and explained why they are essential in computer vision. Then, we covered common and state-of-the-art image augmentation techniques. After establishing the theoretical foundation, we looked at Detectron2’s image augmentation system, which has three main components and their related classes: Transform, Augmentation, and AugInput. Detectron2 provides a declarative approach for applying existing augmentations conveniently.

The existing system supports augmentations on a single input, while several modern image augmentations require data from multiple inputs. Therefore, the next chapter will show you how to modify several Detectron2 data loader components so that you can apply modern image augmentation techniques. The next chapter also describes how to apply test-time augmentations.

Author (1)

Van Vung Pham

Van Vung Pham is a passionate research scientist in machine learning, deep learning, data science, and data visualization. He has years of experience and numerous publications in these areas. He is currently working on projects that use deep learning to predict road damage from pictures or videos taken from roads. One of the projects uses Detectron2 and Faster R-CNN to predict and classify road damage and achieve state-of-the-art results for this task. Dr. Pham obtained his PhD from the Computer Science Department, at Texas Tech University, Lubbock, Texas, USA. He is currently an assistant professor at the Computer Science Department, Sam Houston State University, Huntsville, Texas, USA.