Preface
Computer vision has become a critical success factor in many modern industries, such as automotive, robotics, manufacturing, and biomedical image processing, and its market is growing rapidly. This book will help you explore Detectron2, a next-generation library that provides cutting-edge computer vision algorithms. Many research and practical projects at Facebook (now Meta) use it as a library to support computer vision tasks. Its models can be exported to TorchScript and Open Neural Network Exchange (ONNX) formats for deployment in server production environments (such as a C++ runtime), browsers, and mobile devices.
Using code and visualizations, this book guides you through using existing Detectron2 models for computer vision tasks (object detection, instance segmentation, keypoint detection, semantic segmentation, and panoptic segmentation). It also covers the theory behind Detectron2’s architectures, with visualizations, and explains how each module in Detectron2 works. This book walks you through two complete hands-on, real-life projects (preparing data, training models, fine-tuning models, and deploying them) for object detection and instance segmentation of brain tumors using Detectron2.
The data preparation section discusses common sources of datasets for computer vision applications and tools to collect and label data. It also describes common image annotation formats and provides code to convert from other formats to the one Detectron2 supports. The model training section walks through the steps to prepare the configuration file, load pre-trained weights for transfer learning (if necessary), and modify the default trainer to meet custom business requirements.
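To give a flavor of the annotation conversions mentioned above, here is a minimal sketch in plain Python. It assumes a COCO-style bounding box stored as [x_min, y_min, width, height] and converts it to corner coordinates [x_min, y_min, x_max, y_max]. The function names and the sample record are hypothetical and chosen only for illustration; they are not taken from this book's projects or from the Detectron2 API.

```python
def xywh_to_xyxy(box):
    """Convert [x_min, y_min, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def xyxy_to_xywh(box):
    """Convert [x_min, y_min, x_max, y_max] back to [x_min, y_min, width, height]."""
    x0, y0, x1, y1 = box
    return [x0, y0, x1 - x0, y1 - y0]


# Hypothetical annotation record, invented for illustration only.
annotation = {"bbox": [10.0, 20.0, 30.0, 40.0], "category_id": 0}

# Build a copy of the record with the box expressed as corner coordinates.
converted = dict(annotation, bbox=xywh_to_xyxy(annotation["bbox"]))
print(converted["bbox"])  # [10.0, 20.0, 40.0, 60.0]
```

Real datasets usually need more than a box conversion (category remapping, segmentation masks, image metadata), so treat this only as the smallest building block of such a converter.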
The model fine-tuning section covers inspecting training results using TensorBoard and optimizing Detectron2 solvers. It also provides a primer on common and cutting-edge image augmentation techniques, and shows how to use Detectron2’s existing augmentations or build and apply custom ones at training and testing time. There are also techniques for fine-tuning object detection models, such as computing appropriate anchor-generation configurations (anchor sizes and ratios) and the means and standard deviations of pixel values from custom datasets. For instance segmentation tasks, this book also discusses using PointRend to improve the boundary quality of detected instances.
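As an illustration of the pixel statistics mentioned above, the following sketch computes per-channel means and standard deviations over a set of images in a single pass. It is written in plain Python for clarity and assumes each image is a list of rows of three-channel pixel tuples; the function name and the tiny sample images are hypothetical. In practice, you would likely load real images with an array library and make sure the channel order matches what your model expects.

```python
import math


def channel_stats(images, num_channels=3):
    """Return (means, stds) per channel over all pixels of all images.

    Each image is a list of rows; each row is a list of pixel tuples.
    """
    count = 0
    sums = [0.0] * num_channels
    squared_sums = [0.0] * num_channels
    for image in images:
        for row in image:
            for pixel in row:
                count += 1
                for c in range(num_channels):
                    sums[c] += pixel[c]
                    squared_sums[c] += pixel[c] ** 2
    means = [s / count for s in sums]
    # Population variance: E[x^2] - (E[x])^2, clamped at zero against rounding.
    stds = [
        math.sqrt(max(sq / count - m * m, 0.0))
        for sq, m in zip(squared_sums, means)
    ]
    return means, stds


# Two tiny hypothetical "images" (1 row x 2 pixels each), for illustration only.
images = [
    [[(0, 0, 0), (2, 4, 6)]],
    [[(4, 8, 12), (6, 12, 18)]],
]
means, stds = channel_stats(images)
print(means)  # [3.0, 6.0, 9.0]
```

Accumulating sums and squared sums avoids holding every pixel in memory at once, which matters when the dataset is large.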
This book also covers the steps for deploying Detectron2 models into production and developing Detectron2 applications for mobile devices. Specifically, it describes the model formats and platforms that Detectron2 supports, such as the TorchScript and ONNX formats, and provides code to convert Detectron2 models into these formats using the tracing and scripting approaches. Additionally, code snippets illustrate how to deploy Detectron2 models in C++ and browser environments. Finally, this book discusses D2Go, a platform to train, fine-tune, and quantize computer vision models so that they can be deployed to mobile and edge devices with limited computational resources.
Through this book, you will find that Detectron2 is a valuable framework for anyone looking to build robust computer vision applications.