Setting anchor sizes and anchor ratios
Detectron2 implements Faster R-CNN for object detection tasks, and Faster R-CNN relies on anchors so that the model can predict objects from a fixed set of candidate image patches instead of detecting them from scratch. Anchors come in different sizes and aspect ratios because the objects to be detected have different shapes. In other words, a set of anchors that closely matches the sizes and shapes of the objects to be detected tends to improve prediction performance and reduce training time.
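To make the relationship between sizes, ratios, and anchor shapes concrete, the following minimal sketch enumerates the width and height of one anchor per size/ratio combination. It assumes the convention that a size is the square root of the anchor area and a ratio is height divided by width, which to our understanding matches Detectron2's default anchor generator. The function name cell_anchors and the example values are hypothetical and are not part of Detectron2.

```python
import math


def cell_anchors(sizes, aspect_ratios):
    """Return (width, height) pairs for every size/ratio combination.

    Assumed convention (hypothetical helper, not a Detectron2 API):
    size is the square root of the anchor area, and aspect_ratio is
    height / width.
    """
    anchors = []
    for size in sizes:
        area = float(size) ** 2
        for ratio in aspect_ratios:
            # Solve width * height = area and height / width = ratio.
            width = math.sqrt(area / ratio)
            height = ratio * width
            anchors.append((width, height))
    return anchors


# Illustrative values only: two sizes and three ratios give six anchors.
anchors = cell_anchors(sizes=[32, 64], aspect_ratios=[0.5, 1.0, 2.0])
for width, height in anchors:
    print(f"{width:7.2f} x {height:7.2f}")
```

Every anchor generated for a given size has the same area, and the ratio only changes how that area is distributed between width and height.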
Therefore, the following sections cover the steps to (1) explore how Detectron2 preprocesses the input images, (2) sample data for a predefined number of iterations and extract the ground-truth bounding boxes from the sampled data, and finally, (3) use clustering and genetic algorithms to find the best set of sizes and ratios for training.
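As a rough illustration of what the clustering step consumes, the sketch below converts ground-truth boxes into a size and a ratio. It assumes boxes are given as (x1, y1, x2, y2) in absolute pixel coordinates, that size means the square root of the box area, and that ratio means height divided by width. The helper name box_size_and_ratio and the sample boxes are made up for illustration.

```python
import math


def box_size_and_ratio(box):
    """Convert one box into (size, ratio).

    Assumptions (hypothetical helper, not a Detectron2 API): the box is
    (x1, y1, x2, y2) in pixels, size = sqrt(width * height), and
    ratio = height / width.
    """
    x1, y1, x2, y2 = box
    width, height = x2 - x1, y2 - y1
    if width <= 0 or height <= 0:
        raise ValueError(f"degenerate box: {box}")
    return math.sqrt(width * height), height / width


# Illustrative boxes only, not taken from any dataset.
boxes = [(10, 20, 110, 70), (0, 0, 64, 64)]
stats = [box_size_and_ratio(box) for box in boxes]
for size, ratio in stats:
    print(f"size={size:.2f} ratio={ratio:.2f}")
```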
Preprocessing input images
We need to know the sizes and ratios of the ground-truth boxes in...