Chapter 13. Reinforcement Learning in Image Processing
In this chapter, we will cover one of the most famous application domains in the artificial intelligence (AI) community: computer vision. AI has been applied to images and videos for over two decades now. With better computational power, algorithms such as convolutional neural networks (CNNs) and their variants have worked fairly well in object detection tasks. Significant progress has also been made in automated image captioning, diabetic retinopathy detection, video object detection and captioning, and a lot more.
Reinforcement learning offers promising results and a more generalized approach, but applying it successfully to computer vision remains a challenging task for researchers. We have seen how AlphaGo and AlphaGo Zero have outperformed professional human Go players, where the deep reinforcement learning approach is applied to the image of the game board at each step.
Therefore, here in this chapter we will be covering the most famous domain in...
Hierarchical object detection with deep reinforcement learning
In this section, we will look at how deep reinforcement learning can be applied to hierarchical object detection, following the framework proposed in Hierarchical Object Detection with Deep Reinforcement Learning by Bellver et al. (2016) (https://arxiv.org/pdf/1611.03718.pdf). This work presents a method for hierarchical object detection in images using deep reinforcement learning, with the main focus on the parts of the image that carry richer information. The objective is to train a deep reinforcement learning agent that is given an image window, which is further divided into five smaller windows, and that learns to focus its attention on one of those smaller windows.
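The window hierarchy described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `five_subwindows` and `zoom`, the `scale` parameter, and the layout (four corner-anchored windows plus one centered window) are assumptions made for this example, and the list of action indices stands in for the choices a trained agent would make.

```python
from typing import List, Sequence, Tuple

# A window is (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def five_subwindows(window: Box, scale: float = 0.75) -> List[Box]:
    """Split a window into five candidate subwindows.

    Assumed layout (hypothetical): four windows anchored at the corners
    and one centered window, each `scale` times the parent's width and height.
    """
    x0, y0, x1, y1 = window
    w = (x1 - x0) * scale
    h = (y1 - y0) * scale
    cx = (x0 + x1) / 2
    cy = (y0 + y1) / 2
    return [
        (x0, y0, x0 + w, y0 + h),                          # 0: top-left
        (x1 - w, y0, x1, y0 + h),                          # 1: top-right
        (x0, y1 - h, x0 + w, y1),                          # 2: bottom-left
        (x1 - w, y1 - h, x1, y1),                          # 3: bottom-right
        (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2),  # 4: center
    ]


def zoom(window: Box, actions: Sequence[int], scale: float = 0.75) -> Box:
    """Follow a sequence of subwindow choices, one hierarchy level per action."""
    for action in actions:
        window = five_subwindows(window, scale)[action]
    return window


if __name__ == "__main__":
    image_window = (0, 0, 100, 100)
    # A hand-picked action sequence standing in for a trained agent's choices.
    print(zoom(image_window, actions=[4, 0], scale=0.5))
```

Each action narrows the current window to one of its five children, so a short action sequence is enough to move from the whole image to a small region of interest.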
Now let's consider how we humans look at an image. We tend to extract information in a sequential manner to understand the content of the image:
- First, we focus on the most important part of the image...
In this chapter, we went through different state-of-the-art approaches in object detection such as R-CNN, Fast R-CNN, Faster R-CNN, YOLO, SSD, and others. Furthermore, we explored the approach given in Hierarchical Object Detection with Deep Reinforcement Learning by Bellver et al. (2016). Following this approach, we learned how to create an MDP framework for object detection and how to detect objects hierarchically, exploring the image top-down in as few time steps as possible. Object detection in an image is one application in computer vision. There are other domains such as object detection in videos, video tagging, and many more where reinforcement learning can create state-of-the-art learning agents.
In the next chapter, we will learn how reinforcement learning can be applied in the domain of natural language processing (NLP).