
Hands-On Computer Vision with Detectron2

By Van Vung Pham
  1. Free Chapter
    Chapter 1: An Introduction to Detectron2 and Computer Vision Tasks
About this book
Computer vision is a crucial component of many modern businesses, including automobiles, robotics, and manufacturing, and its market is growing rapidly. This book helps you explore Detectron2, Facebook's next-gen library providing cutting-edge detection and segmentation algorithms. It’s used in research and practical projects at Facebook to support computer vision tasks, and its models can be exported to TorchScript or ONNX for deployment. The book provides you with step-by-step guidance on using existing models in Detectron2 for computer vision tasks (object detection, instance segmentation, keypoint detection, semantic segmentation, and panoptic segmentation). You’ll get to grips with the theories and visualizations of Detectron2’s architecture and learn how each module in Detectron2 works. As you advance, you’ll build your practical skills by working on two real-life projects (preparing data, training models, fine-tuning models, and deployments) for object detection and instance segmentation tasks using Detectron2. Finally, you’ll deploy Detectron2 models into production and develop Detectron2 applications for mobile devices. By the end of this deep learning book, you’ll have gained sound theoretical knowledge and useful hands-on skills to help you solve advanced computer vision tasks using Detectron2.
Publication date:
April 2023
Publisher
Packt
Pages
318
ISBN
9781800561625

 

An Introduction to Detectron2 and Computer Vision Tasks

This chapter introduces Detectron2, its architecture, and the computer vision (CV) tasks that Detectron2 can perform, along with why these tasks matter. It also provides the steps to set up environments for developing CV applications using Detectron2, either locally or in the cloud using Google Colab.

By the end of this chapter, you will understand the main CV tasks (e.g., object detection, instance segmentation, keypoint detection, semantic segmentation, and panoptic segmentation); know how Detectron2 works and what it can do to help you tackle CV tasks using deep learning; and be able to set up local and cloud environments for developing Detectron2 applications.

Specifically, this chapter covers the following topics:

  • Computer vision tasks
  • Introduction to Detectron2 and its architecture
  • Detectron2 development environments
 

Technical requirements

Detectron2 CV applications are built on top of PyTorch. Therefore, a compatible version of PyTorch is expected to run the code examples in this chapter. Later sections of this chapter will provide setup instructions specifically for Detectron2. All the code, datasets, and respective results are available on the GitHub page of the book at https://github.com/PacktPublishing/Hands-On-Computer-Vision-with-Detectron2. It is highly recommended to download the code and follow along.

 

Computer vision tasks

Deep learning achieves state-of-the-art results in many CV tasks. The most common CV task is image classification, in which a deep learning model assigns a class label to a given image. Recent advancements in deep learning, however, allow computers to perform many more advanced vision tasks.

This book focuses on the most common and important ones: object detection, instance segmentation, keypoint detection, semantic segmentation, and panoptic segmentation. It can be challenging to differentiate between these tasks, so Figure 1.1 depicts the differences between them. This section outlines what they are and when to use them, and the rest of the book focuses on how to implement them using Detectron2. Let’s get started!

Figure 1.1: Common computer vision tasks

Object detection

Object detection generally includes object localization and classification. Specifically, deep learning models for this task predict where objects of interest are in an image by placing bounding boxes around these objects (localization). Furthermore, these models also classify the detected objects into types of interest (classification).

One example of this task is detecting people in pictures and drawing bounding boxes around them (localization only), as shown in Figure 1.1 (b). Another example is detecting road damage in a recorded road image by drawing bounding boxes around the damage (localization) and further classifying the damage into types such as longitudinal cracks, transverse cracks, alligator cracks, and potholes (classification).

Instance segmentation

Like object detection, instance segmentation also involves object localization and classification. However, instance segmentation takes things one step further while localizing the detected objects of interest.

Specifically, besides classification, models for this task localize the detected objects at the pixel level. In other words, they identify all the pixels of each detected object. Instance segmentation is needed in applications that require the shapes of the detected objects and need to track every individual object. Figure 1.1 (c) shows the instance segmentation result on the input image in Figure 1.1 (a). In addition to the bounding boxes, every pixel of each person is highlighted.
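
To make the difference between box-level and pixel-level localization concrete, here is a minimal sketch in plain Python (it is not Detectron2 code). The mask_to_box helper and the toy mask are hypothetical illustrations: a binary mask records every pixel of one object, while a bounding box keeps only the extent of that mask.

```python
def mask_to_box(mask):
    """Return (x0, y0, x1, y1) enclosing all 1-pixels of a binary mask, or None if empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, value in enumerate(row) if value]
    if not rows:
        return None
    # Right and bottom edges are exclusive, so width = x1 - x0 and height = y1 - y0.
    return (min(cols), min(rows), max(cols) + 1, max(rows) + 1)


# Toy 4x5 mask of one detected object (1 = pixel belongs to the object).
person_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]

box = mask_to_box(person_mask)
pixel_count = sum(sum(row) for row in person_mask)
print(box, pixel_count)
```

Here the box covers nine pixels, but only six of them belong to the object, which is the extra shape information that instance segmentation provides.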

Keypoint detection

Besides detecting objects, keypoint detection also indicates important parts of the detected objects called keypoints. These keypoints describe the detected object’s essential trait. This trait is often invariant to image rotation, shrinkage, translation, or distortion. For instance, the keypoints of humans include the eyes, nose, shoulders, elbows, hands, knees, and feet. Keypoint detection is important for applications such as action estimation, pose detection, or face detection. Figure 1.1 (d) shows the keypoint detection result on the input image in Figure 1.1 (a). Specifically, besides the bounding boxes, it highlights all keypoints for every detected individual.

Semantic segmentation

A semantic segmentation task does not detect specific instances of objects but classifies each pixel in an image into some classes of interest. For instance, a model for this task classifies regions of images into pedestrians, roads, cars, trees, buildings, and the sky in a self-driving car application. This task is important when providing a broader view of groups of objects with different classes (i.e., a higher level of understanding of the image). Specifically, if individual class instances are in one region, they are grouped into one mask instead of having a different mask for each individual.

One example of the application of semantic segmentation is to segment the images into foreground objects and background objects (e.g., to blur the background and provide a more artistic look for a portrait image). Figure 1.1 (e) shows the semantic segmentation result on the input image in Figure 1.1 (a). Specifically, the input picture is divided into regions classified as things (people or front objects) and background objects such as the sky, a mountain, dirt, grass, and a tree.

Panoptic segmentation

Panoptic literally means “everything visible in the image”. In other words, it can be viewed as combining common CV tasks such as instance segmentation and semantic segmentation. It helps to show the unified and global view of segmentation. Generally, it classifies objects in an image into foreground objects (that have proper geometries) and background objects (that do not have appropriate geometries but are textures or materials).

Examples of foreground objects include people, animals, and cars. Likewise, examples of background objects include the sky, dirt, trees, mountains, and grass. Unlike semantic segmentation, panoptic segmentation does not group adjacent individual objects of the same class into one region. Figure 1.1 (f) shows the panoptic segmentation result on the input image in Figure 1.1 (a).

Specifically, it looks similar to the semantic segmentation result, except it highlights the individual instances separately.
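
The contrast between semantic and panoptic outputs can be sketched with toy data. The following plain-Python example is only an illustration under assumed names (the class ids, masks, and helper functions are hypothetical and are not Detectron2 output formats): the semantic map stores one class id per pixel, while the panoptic map also stores an instance id, so two neighboring people remain distinguishable.

```python
PERSON, SKY = 1, 0  # hypothetical class ids

# Two hypothetical instance masks of the same class on a 2x4 image.
person_a = [[1, 1, 0, 0],
            [1, 1, 0, 0]]
person_b = [[0, 0, 1, 0],
            [0, 0, 1, 0]]


def semantic_map(instances, thing_class, background_class):
    """Per-pixel class ids: all instances of a class collapse into one label."""
    height, width = len(instances[0]), len(instances[0][0])
    return [
        [thing_class if any(m[y][x] for m in instances) else background_class
         for x in range(width)]
        for y in range(height)
    ]


def panoptic_map(instances, thing_class, background_class):
    """Per-pixel (class id, instance id); background pixels use instance id 0."""
    height, width = len(instances[0]), len(instances[0][0])
    result = [[(background_class, 0)] * width for _ in range(height)]
    for instance_id, mask in enumerate(instances, start=1):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    result[y][x] = (thing_class, instance_id)
    return result


semantic = semantic_map([person_a, person_b], PERSON, SKY)
panoptic = panoptic_map([person_a, person_b], PERSON, SKY)
print(semantic[0])
print(panoptic[0])
```

In the semantic map, the two people merge into a single region of class 1; in the panoptic map, the same pixels carry instance ids 1 and 2.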

Important note – other CV tasks

There are other advanced CV projects developed on top of Detectron2, such as DensePose and PointRend. However, this book focuses on developing CV applications for the more common ones, including object detection, instance segmentation, keypoint detection, semantic segmentation, and panoptic segmentation in Chapter 2. Furthermore, Part 2 and Part 3 of this book further explore developing custom CV applications for the two most important tasks (object detection and instance segmentation). There is also a section that describes how to use PointRend to improve instance segmentation quality. Additionally, it is relatively easy to expand the code for other tasks once you understand these tasks.

Let’s get started by getting to know Detectron2 and its architecture!

 

An introduction to Detectron2 and its architecture

Detectron2 is Facebook (now Meta) AI Research’s open source project. It is a next-generation library that provides cutting-edge detection and segmentation algorithms. Many research and practical projects at Facebook use it as a library to support implementing CV tasks. The following sections introduce Detectron2 and provide an overview of its architecture.

Introducing Detectron2

Detectron2 implements state-of-the-art detection algorithms, such as Mask R-CNN, RetinaNet, Faster R-CNN, RPN, TensorMask, PointRend, DensePose, and more. The question that immediately comes to mind after this statement is, why is it better if it re-implements existing cutting-edge algorithms? The answer is that Detectron2 has the advantages of being faster, more accurate, modular, customizable, and built on top of PyTorch.

Specifically, it can be faster and more accurate because reimplementing the cutting-edge algorithms is an opportunity to find suboptimal parts or obsolete features in older implementations and replace them. It is modular, meaning that it divides its implementation into sub-parts: the input data, backbone network, region proposal heads, and prediction heads (the next section covers these components in more detail). It is customizable, meaning its components have built-in implementations that can be replaced with new ones. Finally, it is built on top of PyTorch, meaning that many developer resources are available online to help develop applications with Detectron2.

Furthermore, Detectron2 provides pre-trained models with state-of-the-art detection results for CV tasks. These models were trained with many images on high computation resources at the Facebook research lab that might not be available in other institutions.

These pre-trained models are published on its Model Zoo and are free to use: https://github.com/facebookresearch/detectron2/blob/main/MODEL_ZOO.md.

These pre-trained models help developers build typical CV applications quickly, without collecting and preparing many images or requiring high computation resources to train new models. However, if a CV task must be developed for a specific domain with a custom dataset, these existing models can provide the starting weights, and the whole Detectron2 model can be trained again on the custom dataset.

Finally, we can convert Detectron2 models into deployable artifacts. Precisely, we can convert Detectron2 models into standard file formats of standard deep learning frameworks such as TorchScript, Caffe2 protobuf, and ONNX. These files can then be deployed to their corresponding runtimes, such as PyTorch, Caffe2, and ONNX Runtime. Furthermore, Facebook AI Research also published Detectron2Go (D2Go), a platform where developers can take their Detectron2 development one step further and create models optimized for mobile devices.

In summary, Detectron2 implements cutting-edge detection algorithms with the advantage of being fast, accurate, modular, and built on top of PyTorch. Detectron2 also provides pre-trained models so users can get started and quickly build CV applications with state-of-the-art results. It is also customizable, so users can change its components or train CV applications on a custom business domain. Furthermore, we can export Detectron2 models into formats supported by standard deep learning framework runtimes. Additionally, an initial research project called Detectron2Go supports developing Detectron2 applications for mobile and edge devices.

In the next section, we will look into Detectron2 architecture to understand how it works and the possibilities of customizing each of its components.

Detectron2 architecture

Figure 1.2: The main components of Detectron2

Detectron2 has a modular architecture. Figure 1.2 depicts the four main modules in a standard Detectron2 application. The first module is for registering input data (Input Data).

The second module is the backbone to extract image features (Backbone), followed by the third one for proposing regions with and without objects to be fed to the next training stage (Region Proposal). Finally, the last module uses appropriate heads (such as detection heads, instance segmentation heads, keypoint heads, semantic segmentation heads, or panoptic heads) to predict the regions with objects and classify detected objects into classes. Chapter 3 to Chapter 5 discuss these components for building a CV application for object detection tasks, and Chapter 10 and Chapter 11 detail these components for segmentation tasks. The following sections briefly discuss these components in general.

The input data module

The input data module is designed to load data in large batches from hard drives with optimization techniques such as caching and multi-workers. Furthermore, it is relatively easy to plug data augmentation techniques into a data loader for this module. Additionally, it is designed to be customizable so that users can register their custom datasets. The following is the typical syntax for assigning a custom dataset to train a Detectron2 model using this module:

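from detectron2.data import DatasetCatalog

# load_my_dataset is a user-defined function that returns a list of dataset dicts
DatasetCatalog.register(
    'my_dataset',
    load_my_dataset
)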
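
As a hedged illustration of what such a loader function might return, the sketch below builds one record in the list-of-dicts layout that Detectron2 documents for custom datasets. The file path, image size, box coordinates, and class name are hypothetical placeholders, and the registration step is guarded so that the snippet also runs where Detectron2 is not installed.

```python
def load_my_dataset():
    """Hypothetical loader: returns one record per image as a plain dict."""
    return [
        {
            "file_name": "images/0001.jpg",  # hypothetical image path
            "image_id": 1,
            "height": 480,
            "width": 640,
            "annotations": [
                {
                    "bbox": [100.0, 120.0, 300.0, 400.0],  # x0, y0, x1, y1
                    # Integer value of BoxMode.XYXY_ABS; in real code, prefer
                    # detectron2.structures.BoxMode.XYXY_ABS itself.
                    "bbox_mode": 0,
                    "category_id": 0,  # index into the class list below
                }
            ],
        }
    ]


try:
    from detectron2.data import DatasetCatalog, MetadataCatalog
except ImportError:
    DatasetCatalog = None  # Detectron2 not installed; skip registration

if DatasetCatalog is not None:
    # The loader is called lazily, when the dataset is first requested.
    DatasetCatalog.register("my_dataset", load_my_dataset)
    MetadataCatalog.get("my_dataset").thing_classes = ["person"]  # hypothetical class

records = load_my_dataset()
print(len(records), records[0]["annotations"][0]["bbox"])
```

Registering a function rather than the data itself means that nothing is read from disk until a data loader actually asks for the dataset by name.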

The backbone module

The backbone module extracts features from the input images. Therefore, this module often uses a cutting-edge convolutional neural network such as ResNet or ResNeXt. It can be customized to call any standard convolutional neural network that performs well in an image classification task of interest. Notably, this module benefits greatly from transfer learning. Specifically, if we want a state-of-the-art convolutional neural network that works well with large image datasets such as ImageNet, we can use its pre-trained weights here. Otherwise, we can choose a simpler network for this module to reduce training and prediction time at the cost of some accuracy. Chapter 2 will discuss selecting appropriate pre-trained models on the Detectron2 Model Zoo for common CV tasks.

The following code snippet shows the typical syntax for registering a custom backbone network to train the Detectron2 model using this module:

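from detectron2.modeling import BACKBONE_REGISTRY, Backbone

@BACKBONE_REGISTRY.register()
class CustomBackbone(Backbone):
    pass  # implement __init__, forward, and output_shape here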
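
The register() decorator used above follows a common registry pattern: components are stored under their names so that they can later be looked up from a string. The sketch below is a simplified, hypothetical stand-in written in plain Python to show the idea; it is not Detectron2's actual registry class, and the TOY_BACKBONE_REGISTRY name is invented for this example.

```python
class Registry:
    """Simplified name-to-component registry (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self._items = {}

    def register(self):
        """Return a decorator that stores the decorated object under its own name."""
        def decorator(obj):
            if obj.__name__ in self._items:
                raise KeyError(f"{obj.__name__} is already registered in {self.name}")
            self._items[obj.__name__] = obj
            return obj
        return decorator

    def get(self, key):
        """Look up a registered component by name."""
        return self._items[key]


TOY_BACKBONE_REGISTRY = Registry("TOY_BACKBONE")


@TOY_BACKBONE_REGISTRY.register()
class CustomBackbone:
    def describe(self):
        return "custom feature extractor"


# A configuration value (here a plain string) selects the implementation by name.
backbone_name = "CustomBackbone"
backbone = TOY_BACKBONE_REGISTRY.get(backbone_name)()
print(backbone.describe())
```

In Detectron2 itself, the name to look up comes from the configuration (for example, the MODEL.BACKBONE.NAME key), which is what makes swapping a component a configuration change rather than a code change.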

The region proposal module

The next module is the region proposal module (Region Proposal). This module accepts the extracted features from the backbone and proposes image regions (with location specifications) together with scores that indicate whether the regions contain objects (objectness scores). The objectness score of a proposed region ranges from 0 (no object, that is, background) to 1 (certainty that there is an object of interest in the predicted region). Notably, this objectness score is not the probability of belonging to a particular class of interest; it simply indicates whether the region contains an object (of any class) or not (background).
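
As a rough illustration of how objectness scores are used, the following plain-Python sketch keeps only the highest-scoring proposals. The select_proposals helper, the threshold, and the sample boxes are hypothetical; a real region proposal network also applies further steps, such as non-maximum suppression, that are omitted here.

```python
def select_proposals(proposals, score_threshold=0.5, top_k=2):
    """Keep the top_k proposals whose objectness score meets the threshold.

    Each proposal is ((x0, y0, x1, y1), objectness) with objectness in [0, 1].
    """
    kept = [p for p in proposals if p[1] >= score_threshold]
    kept.sort(key=lambda p: p[1], reverse=True)  # most object-like first
    return kept[:top_k]


proposals = [
    ((10, 10, 50, 80), 0.95),     # very likely an object
    ((12, 14, 48, 76), 0.60),
    ((200, 30, 260, 90), 0.08),   # very likely background
    ((90, 40, 150, 120), 0.72),
]

selected = select_proposals(proposals)
print(selected)
```

Note that the scores say nothing about which class each region contains; classification is left to the heads described in the next module.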

This module is set with a default Region Proposal Network (RPN). However, replacing this network with a custom one is relatively easy. The following is the typical syntax for registering a custom RPN head to train the Detectron2 model using this module:

from torch import nn
from detectron2.modeling import RPN_HEAD_REGISTRY

@RPN_HEAD_REGISTRY.register()
class CustomRPNHead(nn.Module):
    pass

Region of interest module

The last module is the place for the region of interest (RoI) heads. Depending on the CV tasks, we can select appropriate heads for this module, such as detection heads, segmentation heads, keypoint heads, or semantic segmentation heads. For instance, the detection heads accept the region proposals and the input features of the proposed regions and pass them through a fully connected network with two separate output heads: one predicts the bounding boxes of the objects, and the other classifies the detected bounding boxes into their corresponding classes.

On the other hand, semantic segmentation heads also use convolutional neural network heads to classify each pixel into one of the classes of interest. The following is the typical syntax for registering custom region of interest heads to train the Detectron2 model using this module:

from detectron2.modeling import ROI_HEADS_REGISTRY, StandardROIHeads

@ROI_HEADS_REGISTRY.register()
class CustomHeads(StandardROIHeads):
    pass

Now that you have an understanding of Detectron2 and its architecture, let's prepare development environments for developing Detectron2 applications.

 

Detectron2 development environments

Now, we understand the advanced CV tasks and how Detectron2 helps to develop applications for these tasks. It is time to start developing Detectron2 applications. This section provides steps to set up Detectron2 development environments on the cloud using Google Colab, a local environment, or a hybrid approach connecting Google Colab to a locally hosted runtime.

Cloud development environment for Detectron2 applications

Google Colab or Colaboratory (https://colab.research.google.com) is a cloud platform that allows you to write and execute Python code from your web browser. It enables users to start developing deep learning applications with zero configuration because most common machine learning and deep learning packages, such as PyTorch and TensorFlow, are pre-installed. Furthermore, users will have access to GPUs free of charge. Even with the free plan, users have access to a computation resource that is relatively better than a standard personal computer. Users can pay a small amount for Pro or Pro+ with higher computation resources if needed. Additionally, as its name indicates, it is relatively easy to collaborate on Google Colab, and it is easy to share Google Colab files and projects.

Deep learning models for CV tasks work with many images; thus, GPUs significantly speed up training and inference. However, by default, Google Colab does not enable a GPU runtime. Therefore, users should enable the GPU hardware accelerator before installing Detectron2 or training Detectron2 applications. To do so, select GPU from the Hardware accelerator drop-down menu found under Runtime | Change runtime type, as shown in Figure 1.3:

Figure 1.3: Select GPU for Hardware accelerator
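
After changing the runtime type, it can be worth confirming that the GPU is actually visible to PyTorch. The following is a minimal sketch under the assumption that PyTorch is installed (as it is on Colab by default); describe_accelerator is a hypothetical helper name, and the function falls back to a message when PyTorch or a GPU is missing.

```python
def describe_accelerator():
    """Report whether PyTorch can see a CUDA GPU in the current runtime."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        # Device 0 is the first (and on Colab usually the only) GPU.
        return "GPU available: " + torch.cuda.get_device_name(0)
    return "No GPU visible; check Runtime | Change runtime type"


print(describe_accelerator())
```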

Detectron2 has a dedicated tutorial on how to install Detectron2 on Google Colab. This section walks through each step and gives further details. First, Detectron2 is built on top of PyTorch, so we need to have PyTorch installed. By default, the Google Colab runtime comes with PyTorch pre-installed. So, you can use the following snippet to install Detectron2 on Google Colab:

!python -m pip install \
'git+https://github.com/facebookresearch/detectron2.git'

If you have an error message such as the following one, it is safe to ignore it and proceed:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
flask 1.1.4 requires click<8.0,>=5.1, but you have click 8.1.3 which is incompatible.

However, the PyTorch version on Google Colab may not be compatible with Detectron2. If you face such a problem, you can install Detectron2 for specific versions of PyTorch and CUDA. You can use the following snippet to get the PyTorch and CUDA versions:

import torch
TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
CUDA_VERSION = torch.__version__.split("+")[-1]
print("torch: ", TORCH_VERSION, "; cuda: ", CUDA_VERSION)

After obtaining the PyTorch and CUDA versions, you can use the following snippet to install Detectron2. Please remember to replace TORCH_VERSION and CUDA_VERSION with the values found in the previous snippet:

!python -m pip install detectron2 -f \
https://dl.fbaipublicfiles.com/detectron2/wheels/{CUDA_VERSION}/torch{TORCH_VERSION}/index.html

Here is an example of such an installation command for CUDA version 11.3 and PyTorch version 1.10:

!python -m pip install detectron2 -f \
https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
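
To avoid assembling this URL by hand, the version string can be turned into the wheel index location programmatically. The sketch below is an illustration only: detectron2_wheel_index is a hypothetical helper, it follows the URL layout of the example command above, and treating a version string without a "+" suffix as a CPU build is an assumption.

```python
def detectron2_wheel_index(torch_version_string):
    """Build the wheel index URL from a version string such as '1.10.0+cu113'."""
    version, _, build = torch_version_string.partition("+")
    torch_version = ".".join(version.split(".")[:2])  # keep major.minor only
    cuda_version = build or "cpu"  # assumption: no '+' suffix means a CPU build
    return (
        "https://dl.fbaipublicfiles.com/detectron2/wheels/"
        f"{cuda_version}/torch{torch_version}/index.html"
    )


# On a real system, pass torch.__version__ instead of this literal.
print(detectron2_wheel_index("1.10.0+cu113"))
```

A matching URL does not guarantee that a wheel exists for that combination, so the error handling described next still applies.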

If you face an error such as the following, it means that there is no matching Detectron2 distribution for the current versions of PyTorch and CUDA:

ERROR: Could not find a version that satisfies the requirement detectron2 (from versions: none)
ERROR: No matching distribution found for detectron2

In this case, you can visit the Detectron2 installation page to find the distributions compatible with the current PyTorch and CUDA versions. This page is available at https://detectron2.readthedocs.io/en/latest/tutorials/install.html.

Figure 1.4 shows the current Detectron2 distributions with corresponding CUDA/CPU and PyTorch versions:

Figure 1.4: Current Detectron2 distributions for corresponding CUDA/CPU and PyTorch versions

Suppose Detectron2 does not have a distribution that matches your current CUDA and PyTorch versions. Then, there are two options. The first option is to select the Detectron2 version with CUDA and PyTorch versions that are closest to the ones that you have. This approach should generally work. Otherwise, you can install the CUDA and PyTorch versions that Detectron2 supports.

Finally, you can use the following snippet to check the installed Detectron2 version:

import detectron2
print(detectron2.__version__)

Congratulations! You are now ready to develop CV applications using Detectron2 on Google Colab. Read on if you want to create Detectron2 applications on a local machine. Otherwise, you can go to Chapter 2 to start developing Detectron2 CV applications.

Local development environment for Detectron2 applications

Google Colab is an excellent cloud environment to quickly start building deep learning applications. However, it has several limitations. For instance, the free Google Colab plan may not have enough RAM and GPU resources for large projects. Another limitation is that your runtime may terminate if your kernel is idle for a while. Even in the purchased Pro+ plan, a Google Colab kernel can only run for 24 hours, after which it is terminated. That said, if you have a computer with GPUs, it is better to install Detectron2 on this local computer for development.

Important note – resume training option

Due to time limitations, Google Colab may terminate your runtime before your training completes. Therefore, you should train your models with a resumable option so that the Detectron2 training process can pick up the stored weights from its previous training run. Fortunately, Detectron2 supports a resumable training option so that you can do this easily.
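
The idea behind resumable training can be sketched without any deep learning code. The toy loop below is a hypothetical illustration in plain Python (the train function, the JSON checkpoint file, and the iteration counts are all invented for this example): progress is saved after every step, so a restarted run continues from the last saved iteration instead of starting over. In Detectron2, the trainer's resume_or_load(resume=True) call plays this role for model weights and training state.

```python
import json
import os
import tempfile


def train(checkpoint_path, total_iterations, stop_after=None):
    """Toy loop that saves its iteration counter and resumes from it on restart."""
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = json.load(f)["iteration"]  # pick up where the last run stopped

    done = start
    for iteration in range(start, total_iterations):
        # ... one training step would run here ...
        done = iteration + 1
        with open(checkpoint_path, "w") as f:
            json.dump({"iteration": done}, f)
        if stop_after is not None and done - start >= stop_after:
            break  # simulate the runtime being terminated
    return done


checkpoint = os.path.join(tempfile.mkdtemp(), "last_checkpoint.json")
first_run = train(checkpoint, total_iterations=10, stop_after=4)  # interrupted early
second_run = train(checkpoint, total_iterations=10)               # resumes and finishes
print(first_run, second_run)
```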

At the time of writing this book, Detectron2 supports Linux and does not officially support Windows. You may refer to its installation page for some workarounds at https://detectron2.readthedocs.io/en/latest/tutorials/install.html if you want to install Detectron2 on Windows. This section covers the steps to install Detectron2 on Linux. Detectron2 is built on top of PyTorch. Therefore, the main installation requirement (besides Python itself) is PyTorch. Please refer to PyTorch’s official page at https://pytorch.org/ to perform the installation. Figure 1.5 shows the interface to select appropriate configurations for your current system and generate a PyTorch installation command at the bottom.

Figure 1.5: PyTorch installation command generator (https://pytorch.org)

The next requirement is Git, which is needed to install Detectron2 from source. Git is also a tool that any software developer should have, and it is especially valuable when developing relatively complex CV applications. You can use the following steps to install Git and check the installed version from the Terminal:

$ sudo apt-get update
$ sudo apt-get install git
$ git --version

Once PyTorch and Git are installed, the steps to install Detectron2 on a local computer are the same as those used to install Detectron2 on Google Colab, described in the previous section.

Connecting Google Colab to a local development environment

Sometimes developers have already written code in Google Colab, want to use files stored on Google Drive, or simply prefer the Google Colab interface to the standard Jupyter notebook on a local computer. For these cases, Google Colab provides an option to execute its notebooks in a local environment (or in other hosted runtimes, such as Google Cloud instances). Google Colab has instructions for this available here: https://research.google.com/colaboratory/local-runtimes.html.

Important note – browser-specific settings

The following steps are for Google Chrome. If you are using Firefox, you must change a custom setting to allow connections from HTTPS domains over standard WebSockets. The instructions are available here: https://research.google.com/colaboratory/local-runtimes.html.

First, install Jupyter on the local computer. Next, install and enable the jupyter_http_over_ws Jupyter extension using the following snippet:

$ pip install jupyter_http_over_ws
$ jupyter serverextension enable --py jupyter_http_over_ws

The next step is to start the Jupyter server on the local machine with options that allow connections from the Google Colab origin, so that the Google Colab notebook can connect to the local runtime:

$ jupyter notebook \
    --NotebookApp.allow_origin='https://colab.research.google.com' \
    --port=8888 \
    --NotebookApp.port_retries=0

Once the local Jupyter server is running, it prints a backend URL with an authentication token in the Terminal. This URL can be used to access the local runtime from Google Colab. Figure 1.6 shows how to connect the Google Colab notebook to a local runtime by selecting Connect | Connect to a local runtime:

Figure 1.6: Connecting the Google Colab notebook to a local runtime


On the next dialog, enter the backend URL generated in the local Jupyter server and click the Connect button. Congratulations! You can now use the Google Colab notebook to code Python applications using a local kernel.
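To confirm that the notebook is really executing on the local machine rather than on a Google-hosted runtime, you can run a short cell such as the following sketch. It uses only the Python standard library; runtime_info is a hypothetical helper name chosen for illustration.

```python
import platform
import socket
import sys


def runtime_info():
    """Collect basic facts about the machine running the current kernel."""
    return {
        "hostname": socket.gethostname(),   # name of the machine
        "python": sys.executable,           # interpreter used by the kernel
        "platform": platform.platform(),    # operating system description
    }


for key, value in runtime_info().items():
    print(f"{key}: {value}")
```

When the connection to the local runtime is active, the hostname and Python path should match those of your own computer.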


Summary

This chapter discussed advanced CV tasks, including object detection, instance segmentation, keypoint detection, semantic segmentation, and panoptic segmentation, and when to use them. Detectron2 is a framework that helps implement cutting-edge algorithms for these CV tasks, with the advantages of being faster, more accurate, modular, customizable, and built on top of PyTorch. Its architecture has four main parts: input data, backbone, region proposal, and region of interest heads. Each of these components is replaceable with a custom implementation. This chapter also provided the steps to set up a cloud development environment using Google Colab, to set up a local development environment, and to connect Google Colab to a local runtime if needed.

You now understand the leading CV tasks Detectron2 can help develop and have set up a development environment. The next chapter (Chapter 2) will guide you through the steps to build CV applications for all the listed CV tasks using the cutting-edge models provided in the Detectron2 Model Zoo.

About the Author
  • Van Vung Pham

    Van Vung Pham is a passionate research scientist in machine learning, deep learning, data science, and data visualization. He has years of experience and numerous publications in these areas. He is currently working on projects that use deep learning to predict road damage from pictures or videos taken from roads. One of the projects uses Detectron2 and Faster R-CNN to predict and classify road damage and achieves state-of-the-art results for this task. Dr. Pham obtained his PhD from the Computer Science Department at Texas Tech University, Lubbock, Texas, USA. He is currently an assistant professor in the Computer Science Department at Sam Houston State University, Huntsville, Texas, USA.
