
You're reading from Hands-On Computer Vision with Detectron2
Product type: Book | Published: Apr 2023 | Reading level: Beginner | Publisher: Packt | ISBN-13: 9781800561625 | Edition: 1st
Author: Van Vung Pham

Van Vung Pham is a passionate research scientist in machine learning, deep learning, data science, and data visualization. He has years of experience and numerous publications in these areas. He is currently working on projects that use deep learning to predict road damage from pictures or videos taken of roads. One of these projects uses Detectron2 and Faster R-CNN to predict and classify road damage, achieving state-of-the-art results for this task. Dr. Pham obtained his PhD from the Computer Science Department at Texas Tech University, Lubbock, Texas, USA. He is currently an assistant professor in the Computer Science Department at Sam Houston State University, Huntsville, Texas, USA.

Fine-Tuning Object Detection Models

Detectron2 uses the concept of anchors to improve object detection accuracy: detection models predict offsets from a set of anchors instead of predicting boxes from scratch. The set of anchors has various sizes and ratios to reflect the shapes of the objects to be detected. Detectron2 uses two sets of hyperparameters, called sizes and ratios, to generate the initial set of anchors. Therefore, this chapter explains how Detectron2 processes its inputs and provides code to analyze the ground-truth boxes from a training dataset and find appropriate values for these anchor sizes and ratios.
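As a concrete illustration of how sizes and ratios combine, each anchor template can be derived from one (size, ratio) pair so that the anchor's area is size squared and its height-to-width ratio matches. The following is a minimal sketch of that combination logic, not Detectron2's internal implementation; the helper name is illustrative:

```python
import math

def make_anchors(sizes, ratios):
    """Build (x0, y0, x1, y1) anchor templates centered at the origin.

    Each anchor has area size**2; ratio is height/width, so
    w = sqrt(area / ratio) and h = w * ratio.
    """
    anchors = []
    for size in sizes:
        area = size ** 2
        for ratio in ratios:
            w = math.sqrt(area / ratio)
            h = w * ratio
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return anchors

# 3 sizes x 3 ratios -> 9 anchor templates per feature-map location
templates = make_anchors(sizes=[64, 128, 256], ratios=[0.5, 1.0, 2.0])
print(len(templates))  # 9
```

With 3 sizes and 3 ratios, every feature-map location gets 9 candidate boxes, which is why choosing these two small lists well matters so much for detection quality.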

Additionally, input image pixels’ means and standard deviations are crucial in training Detectron2 models. Specifically, Detectron2 uses these values to normalize the input images during training. Calculating these hyperparameters over the whole dataset at once is often impossible for large datasets. Therefore, this chapter provides the code to calculate...

Technical requirements

You should have completed Chapter 1 to have an appropriate development environment for Detectron2. All the code, datasets, and results are available on the GitHub repo of the book at https://github.com/PacktPublishing/Hands-On-Computer-Vision-with-Detectron2. It is highly recommended to download the code and follow along.

Important note

This chapter has code that includes random number generators. Therefore, several values produced in this chapter may differ from run to run. However, the output values should be similar, and the main concepts remain the same.

Setting anchor sizes and anchor ratios

Detectron2 implements Faster R-CNN for object detection tasks, and Faster R-CNN makes excellent use of anchors to let the object detection model predict from a fixed set of image patches instead of detecting boxes from scratch. Anchors have different sizes and ratios to accommodate the fact that the objects to be detected have different shapes. In other words, a set of anchors closer to the shapes of the objects to be detected improves prediction performance and reduces training time.

Therefore, the following sections cover the steps to (1) explore how Detectron2 prepares its input image data, (2) sample data for some predefined number of iterations and extract the ground-truth bounding boxes from the sampled data, and finally, (3) utilize clustering and genetic algorithms to find the best set of sizes and ratios for training.
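To give a feel for step (3), a plain 1-D k-means over the ground-truth box dimensions can already suggest candidate anchor sizes (the chapter additionally refines the result with a genetic algorithm, which is not shown here). The box data and helper below are made up for illustration:

```python
import random

def kmeans_1d(values, k, iters=50, seed=0):
    """Plain 1-D k-means: cluster values (e.g., box sizes) into k centers."""
    random.seed(seed)
    centers = random.sample(values, k)
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            i = min(range(k), key=lambda c: abs(v - centers[c]))
            buckets[i].append(v)
        centers = [sum(b) / len(b) if b else centers[i]
                   for i, b in enumerate(buckets)]
    return sorted(centers)

# Hypothetical ground-truth (width, height) pairs from a sampled dataset
boxes = [(40, 55), (70, 65), (120, 110), (45, 60), (130, 150), (75, 80)]
sizes = [(w * h) ** 0.5 for w, h in boxes]  # size = sqrt(area)
ratios = [h / w for w, h in boxes]          # ratio = height / width
print(kmeans_1d(sizes, k=2))
```

The same clustering applied to the `ratios` list yields candidate aspect ratios; the resulting centers are what would feed the anchor-generator configuration.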

Preprocessing input images

We need to know the sizes and ratios of the ground-truth boxes in...

Setting pixel means and standard deviations

Input image pixels’ means and standard deviations are crucial in training Detectron2 models. Specifically, Detectron2 uses these values to normalize the input images. Detectron2 has two configuration parameters for them: cfg.MODEL.PIXEL_MEAN and cfg.MODEL.PIXEL_STD. By default, these two hyperparameters are set to the common values generated from the ImageNet dataset, [103.53, 116.28, 123.675] and [57.375, 57.120, 58.395], which are appropriate for most color images. However, this specific case has grayscale images with different pixel means and standard deviations. Therefore, it is beneficial to produce these two sets of values from the training dataset. This task has two main stages: (1) preparing a data loader to load images and (2) creating a class to calculate running means and standard deviations.
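Stage (2) can be sketched as a small accumulator that updates per-channel sums batch by batch, so the full dataset never needs to be in memory at once. This is a plain-Python illustration with made-up class and method names, not the chapter's actual class; for very long pixel streams, a Welford-style update is more numerically stable than the sum-of-squares formula used here:

```python
class RunningStats:
    """Accumulate per-channel pixel mean and (population) std batch by batch."""

    def __init__(self, channels=3):
        self.count = 0
        self.sum = [0.0] * channels
        self.sum_sq = [0.0] * channels

    def update(self, pixels):
        """pixels: iterable of per-channel tuples, e.g. [(b, g, r), ...]."""
        for px in pixels:
            self.count += 1
            for c, v in enumerate(px):
                self.sum[c] += v
                self.sum_sq[c] += v * v

    @property
    def mean(self):
        return [s / self.count for s in self.sum]

    @property
    def std(self):
        # population std: sqrt(E[x^2] - E[x]^2), computed per channel
        return [(sq / self.count - (s / self.count) ** 2) ** 0.5
                for s, sq in zip(self.sum, self.sum_sq)]

# Feed one batch of pixel values at a time
stats = RunningStats(channels=3)
stats.update([(10, 20, 30), (30, 40, 50)])
print(stats.mean)  # [20.0, 30.0, 40.0]
```

After iterating over all batches, the accumulated results would be assigned to cfg.MODEL.PIXEL_MEAN and cfg.MODEL.PIXEL_STD.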

Preparing a data loader

Detectron2’s data loader is iterable and can yield infinite...
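Because the loader never stops on its own, the usual pattern is to take a fixed number of batches from it with itertools.islice. The sketch below uses a stand-in infinite generator so it runs anywhere; with Detectron2 itself, the loader would come from detectron2.data.build_detection_train_loader(cfg) and be sliced the same way:

```python
from itertools import islice

def infinite_loader(dataset):
    """Stand-in for an infinite training loader: cycle forever over the data."""
    while True:
        for item in dataset:
            yield item

# Take a fixed number of "iterations" from the infinite stream -- the same
# pattern used to sample data and collect ground-truth boxes for analysis.
loader = infinite_loader([{"id": 0}, {"id": 1}, {"id": 2}])
batches = list(islice(loader, 5))
print(len(batches))  # 5
```

Without islice (or an equivalent counter), iterating over the loader directly would loop forever.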

Putting it all together

The code for training the custom model with the ability to perform evaluations and a hook to save the best model remains the same as in the previous chapter. However, the configuration should be as follows:

# Code to generate the cfg object is removed to save space
# Solver
cfg.SOLVER.IMS_PER_BATCH = 6
cfg.SOLVER.BASE_LR = 0.001
cfg.SOLVER.WARMUP_ITERS = 1000
cfg.SOLVER.MOMENTUM = 0.9
cfg.SOLVER.STEPS = (3000, 4000)
cfg.SOLVER.GAMMA = 0.5
cfg.SOLVER.NESTEROV = False
cfg.SOLVER.MAX_ITER = 5000
# Checkpoint
cfg.SOLVER.CHECKPOINT_PERIOD = 500
# Anchors
cfg.MODEL.ANCHOR_GENERATOR.SIZES = [[68.33245953, 112.91302277, 89.55701886, 144.71037342, 47.77637482]]
cfg.MODEL.ANCHOR_GENERATOR.ASPECT_RATIOS = [[0.99819939, 0.78726896, 1.23598428]]
# Pixels
cfg.MODEL.PIXEL_MEAN = [20.1962, 20.1962, 20.1962]
cfg.MODEL.PIXEL_STD = [39.5985, 39.5985, 39.5985]
# Other parameters, identical to the previous chapter, are removed here

Please refer to the complete Jupyter notebook on GitHub...

Summary

This chapter provides code and visualizations to explain how Detectron2 preprocesses its inputs. In addition, it provides code to analyze the ground-truth bounding boxes and uses a genetic algorithm to select suitable values for the anchor settings (anchor sizes and ratios). Additionally, it explains the steps to produce the input pixels’ means and standard deviations from the training dataset in a running (per-batch) manner when the training dataset is large and does not fit in memory at once. Finally, this chapter puts the configurations derived in the previous chapter and this one into training. The results indicate that, with a few modifications, accuracy improves without impacting training or inference time. The next chapter covers image augmentation techniques and uses them, together with these training configurations, to fine-tune the Detectron2 model for predicting brain tumors.
