You're reading from Hands-On Computer Vision with Detectron2

Product type: Book

Published in: Apr 2023

Reading Level: Beginner

Publisher: Packt

ISBN-13: 9781800561625

Edition: 1st Edition

Languages

Python

Tools

PyTorch

Concepts

Computer Vision

Author (1)

Van Vung Pham

Applying Train-Time and Test-Time Image Augmentations

The previous chapter introduced the augmentation and transformation classes that Detectron2 offers. This chapter shows how to apply these classes during training. Although Detectron2 offers many image augmentation classes, they all work on annotations from a single input at a time, whereas modern techniques may need to combine annotations from several inputs. Therefore, this chapter also covers the foundations of Detectron2’s data loader component. Understanding this component helps explain how to apply existing image augmentations and how to modify existing code to implement custom techniques that load data from multiple inputs. Finally, this chapter details the steps for applying image augmentations at test time to improve accuracy.

By the end of this chapter, you will be able to understand how Detectron2 loads its data, how to apply existing...

Technical requirements

You should have completed Chapter 1 to have an appropriate development environment for Detectron2. All the code, datasets, and results are available on the GitHub repo of the book at https://github.com/PacktPublishing/Hands-On-Computer-Vision-with-Detectron2.

The Detectron2 data loader

Applying augmentations in Detectron2 can be straightforward or complicated, depending on the technique. It is relatively easy to use the declarative approach and apply the existing transformations and augmentations provided by Detectron2, which should meet the most common needs. However, adding custom augmentations that require multiple data samples (e.g., MixUp and Mosaic) is more involved. This section describes how Detectron2 loads data and how to incorporate existing and custom data augmentations into training Detectron2 models. Figure 9.1 illustrates the steps and main components of the Detectron2 data loading system.

Figure 9.1: Loading data and data augmentations in Detectron2

There are classes for Dataset, Sampler, Mapper, and Loader. The Dataset component normally stores a list of data items in JSON format. The Sampler component helps to randomly select one data item (dataset_dict) from the dataset. The selected data item has...
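The roles of these four components can be illustrated with a small, library-free sketch. This is a simplified model under stated assumptions, not Detectron2's actual classes: the records, the `sampler`, `mapper`, and `loader` functions, and the placeholder "image" string are all hypothetical stand-ins chosen for illustration.

```python
import random

# Dataset: a list of lightweight dicts, one per image (hypothetical records).
dataset = [
    {"file_name": f"img_{i}.jpg", "height": 480, "width": 640,
     "annotations": [{"bbox": [10, 20, 110, 220], "category_id": 0}]}
    for i in range(6)
]

def sampler(size, seed=0):
    """Yield an endless stream of shuffled indices, one pass after another."""
    rng = random.Random(seed)
    while True:
        order = list(range(size))
        rng.shuffle(order)
        yield from order

def mapper(dataset_dict):
    """Turn one record into a model input; image loading is faked here."""
    item = dict(dataset_dict)  # shallow copy so the dataset stays untouched
    item["image"] = f"<pixels of {item['file_name']}>"  # stand-in for a tensor
    return item

def loader(data, batch_size, seed=0):
    """Sample indices, map each selected record, and group into batches."""
    batch = []
    for idx in sampler(len(data), seed):
        batch.append(mapper(data[idx]))
        if len(batch) == batch_size:
            yield batch
            batch = []

batches = loader(dataset, batch_size=2)
first_batch = next(batches)
print([item["file_name"] for item in first_batch])
```

The point of the split is that the dataset stays cheap to hold in memory, while the expensive work (reading pixels, applying augmentations) happens per item inside the mapper.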

Applying existing image augmentation techniques

Augmentations are in the Mapper component. A Mapper receives a list of augmentations and applies them to the image and annotations accordingly. The following code snippet creates a Detectron2 trainer and specifies a list of existing augmentations to use:

class MyTrainer(DefaultTrainer):
  @classmethod
  def build_train_loader(cls, cfg):
    augs = []
    # Aug 1: Add RandomBrightness with 50% chance
    # Aug 2: Add ResizeShortestEdge
    # Aug 3: Add RandomFlipping
    mapper = DatasetMapper(cfg,
                           is_train       = True,
                 ...
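To make concrete what applying augmentations "to the image and annotations accordingly" involves, the following library-free sketch mimics the geometry of a shortest-edge resize followed by a random horizontal flip and shows how bounding boxes must follow the image. It is an illustration only and does not use Detectron2's classes: the function names and the default values (800, 1333, 0.5) are assumptions, and brightness changes are left out because they alter pixels but not box coordinates.

```python
import random

def resize_shortest_edge_shape(h, w, short_edge, max_size):
    """New (height, width) so the shorter side equals short_edge,
    shrinking further if the longer side would exceed max_size."""
    scale = short_edge / min(h, w)
    new_h, new_w = h * scale, w * scale
    if max(new_h, new_w) > max_size:
        shrink = max_size / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    return int(new_h + 0.5), int(new_w + 0.5)

def scale_box(box, old_hw, new_hw):
    """Rescale an [x0, y0, x1, y1] box from the old image size to the new one."""
    sy = new_hw[0] / old_hw[0]
    sx = new_hw[1] / old_hw[1]
    x0, y0, x1, y1 = box
    return [x0 * sx, y0 * sy, x1 * sx, y1 * sy]

def hflip_box(box, width):
    """Mirror an [x0, y0, x1, y1] box around the vertical axis of the image."""
    x0, y0, x1, y1 = box
    return [width - x1, y0, width - x0, y1]

def apply_augs(hw, boxes, short_edge=800, max_size=1333,
               flip_prob=0.5, rng=random):
    """Apply resize, then a random horizontal flip, to the boxes of one sample."""
    new_hw = resize_shortest_edge_shape(hw[0], hw[1], short_edge, max_size)
    boxes = [scale_box(b, hw, new_hw) for b in boxes]
    if rng.random() < flip_prob:
        boxes = [hflip_box(b, new_hw[1]) for b in boxes]
    return new_hw, boxes

new_hw, new_boxes = apply_augs((480, 640), [[10, 20, 110, 220]])
print(new_hw, new_boxes)
```

In Detectron2 itself, this bookkeeping is handled by the Mapper, so the trainer only has to declare which augmentations to use.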

Developing custom image augmentation techniques

Suppose the custom augmentation requires annotations loaded from one data sample (dataset_dict). In that case, it is relatively simple to implement the custom augmentation or transformation and incorporate it using the declarative approach described in the previous section. This section focuses on more complicated augmentation types that require loading and combining inputs from multiple samples. In these cases, we need to rewrite several parts of the Detectron2 data loader system. Thus, this section describes modifications to the Detectron2 data loader system and develops two custom image augmentations (MixUp and Mosaic) for illustration purposes.
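As a rough illustration of why such techniques need more than one sample, here is a minimal MixUp sketch in plain Python. It is not the implementation developed in this chapter: it assumes both images are same-sized grayscale grids stored as nested lists, uses a fixed blending weight `lam`, and the `mixup` function and sample dictionaries are hypothetical.

```python
def mixup(sample_a, sample_b, lam=0.5):
    """Blend two same-sized images pixel by pixel and keep both annotation sets."""
    img_a, img_b = sample_a["image"], sample_b["image"]
    if len(img_a) != len(img_b) or len(img_a[0]) != len(img_b[0]):
        raise ValueError("this sketch assumes images of identical size")
    mixed = [
        [lam * pa + (1 - lam) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]
    # Objects from both inputs remain visible, so both sets of boxes are kept.
    annotations = sample_a["annotations"] + sample_b["annotations"]
    return {"image": mixed, "annotations": annotations}

# Two tiny 2x2 "images" with one box each (made-up data).
a = {"image": [[100, 100], [100, 100]],
     "annotations": [{"bbox": [0, 0, 1, 1], "category_id": 0}]}
b = {"image": [[200, 200], [200, 200]],
     "annotations": [{"bbox": [1, 1, 2, 2], "category_id": 1}]}

mixed_sample = mixup(a, b, lam=0.5)
print(mixed_sample["image"], len(mixed_sample["annotations"]))
```

The function needs two complete samples at once, which is exactly what a mapper that receives a single dataset_dict cannot provide without modification.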

Modifying the existing data loader

The following code snippet imports some required packages and creates an extended version (ExtendedAugInput) of the AugInput class that also enables passing a dataset_dict parameter. The reason is that augmentations such as MixUp and Mosaic may add annotations...
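To see how an augmentation can add annotations that come from other samples, consider this simplified Mosaic sketch. It is an assumption-laden illustration rather than the chapter's code: it tiles exactly four same-sized images into a 2x2 grid without the random scaling or cropping that practical Mosaic variants often use, and the `mosaic` function and its inputs are hypothetical.

```python
def mosaic(samples):
    """Tile four same-sized images into a 2x2 grid and shift their boxes."""
    if len(samples) != 4:
        raise ValueError("this sketch combines exactly four samples")
    h = len(samples[0]["image"])
    w = len(samples[0]["image"][0])
    canvas = [[0] * (2 * w) for _ in range(2 * h)]
    annotations = []
    # (dy, dx) offset of each tile: top-left, top-right, bottom-left, bottom-right.
    offsets = [(0, 0), (0, w), (h, 0), (h, w)]
    for sample, (dy, dx) in zip(samples, offsets):
        for y, row in enumerate(sample["image"]):
            canvas[dy + y][dx:dx + w] = row  # paste one row of the tile
        for ann in sample["annotations"]:
            x0, y0, x1, y1 = ann["bbox"]
            shifted = [x0 + dx, y0 + dy, x1 + dx, y1 + dy]
            annotations.append({**ann, "bbox": shifted})
    return {"image": canvas, "annotations": annotations}

# Four tiny 2x2 "images", each filled with a distinct value (made-up data).
tiles = [
    {"image": [[v, v], [v, v]],
     "annotations": [{"bbox": [0, 0, 2, 2], "category_id": v}]}
    for v in (1, 2, 3, 4)
]
combined = mosaic(tiles)
print(combined["image"])
print([ann["bbox"] for ann in combined["annotations"]])
```

The output sample carries four sets of annotations although the mapper started from one, which is why the augmentation input needs access to more than the image and its own boxes.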

Applying test-time image augmentation techniques

Test-time augmentations (TTA) can improve prediction performance by running the model on several versions of the input image and then applying non-maximum suppression (NMS) to the combined predictions. Detectron2 provides two classes for this: DatasetMapperTTA and GeneralizedRCNNWithTTA. The DatasetMapperTTA class maps a dataset dictionary (a data item in JSON format) into the format expected by Detectron2 models while optionally applying augmentations. The default augmentations are ResizeShortestEdge and RandomFlip. The GeneralizedRCNNWithTTA class takes the original model and the Mapper object as inputs. It performs predictions on the augmented data and post-processes the resulting outputs.
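The merging idea behind TTA can be sketched without any library: predictions made on a horizontally flipped copy are mapped back to the original coordinates, pooled with the predictions on the original image, and de-duplicated with NMS. This is a simplified illustration, not the logic of GeneralizedRCNNWithTTA; the detection dictionaries, the function names, and the 0.5 IoU threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two [x0, y0, x1, y1] boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(dets, thr=0.5):
    """Keep the highest-scoring box among same-class boxes that overlap."""
    keep = []
    for det in sorted(dets, key=lambda d: d["score"], reverse=True):
        overlaps = any(
            k["label"] == det["label"] and iou(k["box"], det["box"]) > thr
            for k in keep
        )
        if not overlaps:
            keep.append(det)
    return keep

def unflip(det, width):
    """Map a box predicted on a horizontally flipped image back to the original."""
    x0, y0, x1, y1 = det["box"]
    return {**det, "box": [width - x1, y0, width - x0, y1]}

def merge_tta(original_dets, flipped_dets, width, thr=0.5):
    """Pool predictions from both views in original coordinates, then run NMS."""
    pooled = list(original_dets) + [unflip(d, width) for d in flipped_dets]
    return nms(pooled, thr)

# Made-up predictions on a 100-pixel-wide image and on its flipped copy.
original = [{"box": [10, 10, 50, 50], "score": 0.9, "label": 0}]
flipped = [
    {"box": [50, 10, 90, 50], "score": 0.8, "label": 0},  # same object, mirrored
    {"box": [0, 60, 20, 80], "score": 0.7, "label": 0},   # seen only when flipped
]
merged = merge_tta(original, flipped, width=100)
print(merged)
```

The second flipped detection survives the merge, which is the benefit TTA aims for: an object missed in one view can still be recovered from another.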

Let us use the code approach to explain these two classes. As a routine, we first install Detectron2, load the brain tumors dataset, and register the test dataset. Next, the following code snippet gets a pre-trained model...

Summary

This chapter described the steps to apply image augmentation techniques using Detectron2 at both train time and test time (inference time). Detectron2 provides a declarative approach for applying existing augmentations conveniently. However, the current system supports augmentations on a single input, while several modern image augmentations require data from multiple inputs. Therefore, this chapter described the Detectron2 data loader system and provided steps to modify several of its components to enable modern image augmentation techniques, such as MixUp and Mosaic, that require multiple inputs. Lastly, this chapter described the features in Detectron2 that support test-time augmentations.

Congratulations! You now understand the Detectron2 architecture for object detection models and should have mastered the steps to prepare data, train, and fine-tune Detectron2 object detection models. The following part of this book has a similar...


Author (1)

Van Vung Pham

Van Vung Pham is a passionate research scientist in machine learning, deep learning, data science, and data visualization. He has years of experience and numerous publications in these areas. He is currently working on projects that use deep learning to predict road damage from pictures or videos taken from roads. One of the projects uses Detectron2 and Faster R-CNN to predict and classify road damage and achieve state-of-the-art results for this task. Dr. Pham obtained his PhD from the Computer Science Department, at Texas Tech University, Lubbock, Texas, USA. He is currently an assistant professor at the Computer Science Department, Sam Houston State University, Huntsville, Texas, USA.