Deep Learning Quick Reference

Product type: Book
Published: Mar 2018
Publisher: Packt
ISBN-13: 9781788837996
Pages: 272
Edition: 1st
Author: Mike Bernico

Table of Contents (15 chapters)

1. Preface
2. The Building Blocks of Deep Learning
3. Using Deep Learning to Solve Regression Problems
4. Monitoring Network Training Using TensorBoard
5. Using Deep Learning to Solve Binary Classification Problems
6. Using Keras to Solve Multiclass Classification Problems
7. Hyperparameter Optimization
8. Training a CNN from Scratch
9. Transfer Learning with Pretrained CNNs
10. Training an RNN from scratch
11. Training LSTMs with Word Embeddings from Scratch
12. Training Seq2Seq Models
13. Using Deep Reinforcement Learning
14. Generative Adversarial Networks
15. Other Books You May Enjoy

Using Deep Reinforcement Learning

In this chapter, we're going to use deep neural networks in a slightly different way. Rather than predicting the membership of a class, estimating a value, or even generating a sequence, we're going to build an intelligent agent. While the terms machine learning and artificial intelligence are often used interchangeably, in this chapter we will talk about artificial intelligence as an intelligent agent that can perceive its environment and take steps to accomplish some goal in that environment.

Imagine an agent that can play a strategy game such as chess or Go. A very naive approach to building a neural network to solve such a game might be to use a network architecture where we one-hot encode every possible board/piece combination and then predict every possible next move. As massive and complex as that network...

Reinforcement learning overview

Reinforcement learning is based on the concept of an intelligent agent. An agent interacts with its environment by observing some state and then taking an action. As the agent takes actions to move between states, it receives feedback about the goodness of its actions in the form of a reward signal. This reward signal is the reinforcement in reinforcement learning. It's a feedback loop the agent can use to learn the goodness of its choices. Of course, rewards can be both positive and negative (punishments).

Imagine a self-driving car as the agent we are building. As it drives down the road, it receives a constant stream of reward signals for its actions. Staying within the lanes would likely lead to a positive reward, while running over pedestrians would likely result in a very negative reward for the agent...
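The state-action-reward feedback loop described above can be sketched with a small tabular Q-learning example. Everything here is illustrative and my own (the corridor environment, the hyperparameters); it is not the book's code, and it uses a lookup table rather than a deep network, but the reinforcement mechanism is the same one the deep Q networks in this chapter rely on:

```python
import random

# Illustrative only: a tiny 5-cell corridor. The agent starts in cell 0
# and receives a reward of +1 only upon reaching cell 4 (the goal).
# Actions: 0 = move left, 1 = move right.
N_STATES, GOAL = 5, 4
ACTIONS = [0, 1]

def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

# Tabular Q-learning: Q[state][action] estimates the long-term value
# of taking `action` in `state`.
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1
random.seed(0)

for episode in range(200):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the best-known action; explore
        # randomly otherwise (and whenever the estimates are still tied).
        if random.random() < epsilon or Q[state][0] == Q[state][1]:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[state][a])
        next_state, reward, done = step(state, action)
        # The reward signal reinforces (or punishes) the chosen action.
        Q[state][action] += alpha * (
            reward + gamma * max(Q[next_state]) - Q[state][action]
        )
        state = next_state

# After training, the greedy policy in every non-goal cell is "move right".
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(GOAL)]
print(policy)
```

The deep Q networks later in the chapter replace the table `Q` with a neural network that generalizes across states, which is what makes the approach scale to inputs like game screens.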

The Keras reinforcement learning framework

At this point, we should have just enough background to start building a deep Q network, but there's still a pretty big hurdle we need to overcome.

Implementing an agent that utilizes deep reinforcement learning can be quite a challenge; however, the Keras-RL library, originally authored by Matthias Plappert, makes it much easier. I'll be using his library to power the agents presented in this chapter.

Of course, our agent can't have much fun without an environment. I'll be using the OpenAI gym, which provides many environments, complete with states and reward functions, that we can easily use to build worlds for our agents to explore.
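Every OpenAI gym environment exposes the same small interface: `reset()` returns an initial observation, and `step(action)` returns the next observation, a reward, a done flag, and an info dict. A hand-rolled stand-in (my own toy, not a real gym environment) can illustrate the loop our agents will run without requiring gym to be installed:

```python
import random

class CoinFlipEnv:
    """A toy stand-in mimicking the gym-style interface (not part of gym).

    The agent guesses a coin flip; a correct guess earns +1, otherwise -1.
    Real gym environments expose the same reset()/step() methods.
    """

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)

    def reset(self):
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin  # observation

    def step(self, action):
        reward = 1.0 if action == self.coin else -1.0
        self.coin = self.rng.randint(0, 1)
        self.t += 1
        done = self.t >= self.episode_length
        return self.coin, reward, done, {}  # obs, reward, done, info

# The canonical agent-environment loop, identical in shape to gym's:
env = CoinFlipEnv()
obs, done, total_reward = env.reset(), False, 0.0
while not done:
    action = random.choice([0, 1])  # a random policy, for illustration
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(total_reward)
```

Keras-RL agents drive exactly this loop for us; we supply the network and the environment, and the library handles stepping, memory, and training.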

Installing Keras-RL

Keras-RL...

Building a reinforcement learning agent in Keras

Good news: we're finally ready to start coding. In this section, I'm going to demonstrate two Keras-RL agents, one for the CartPole environment and one for Lunar Lander. I've chosen these examples because they won't consume your GPU and your cloud budget to run. They can be easily extended to Atari problems, and I've included one of those as well in the book's Git repository. You can find all this code in the Chapter12 folder, as usual. Let's talk quickly about these two environments:

  • CartPole: The CartPole environment consists of a pole, balanced on a cart. The agent has to learn how to balance the pole vertically, while the cart underneath it moves. The agent is given the position of the cart, the velocity of the cart, the angle of the pole, and the rotational rate of the pole as inputs. The agent can apply a force on...
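The four state variables listed above can be made concrete with a simplified cart-pole physics step. This is an illustrative Euler-integrated sketch loosely following the classic formulation that gym's CartPole environment is based on; the constants are the commonly used values, not taken from the book, and gym's actual implementation differs in details:

```python
import math

# Commonly used cart-pole constants (illustrative, not the book's code).
GRAVITY, CART_MASS, POLE_MASS, POLE_HALF_LEN = 9.8, 1.0, 0.1, 0.5
FORCE_MAG, DT = 10.0, 0.02
TOTAL_MASS = CART_MASS + POLE_MASS

def cartpole_step(state, action):
    """Advance the simulation one timestep.

    state  = (cart position, cart velocity, pole angle, pole angular rate)
    action = 0 (push left) or 1 (push right)
    """
    x, x_dot, theta, theta_dot = state
    force = FORCE_MAG if action == 1 else -FORCE_MAG
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    temp = (force + POLE_MASS * POLE_HALF_LEN * theta_dot**2 * sin_t) / TOTAL_MASS
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LEN * (4.0 / 3.0 - POLE_MASS * cos_t**2 / TOTAL_MASS)
    )
    x_acc = temp - POLE_MASS * POLE_HALF_LEN * theta_acc * cos_t / TOTAL_MASS

    # Euler integration of the four state variables the agent observes.
    return (x + DT * x_dot,
            x_dot + DT * x_acc,
            theta + DT * theta_dot,
            theta_dot + DT * theta_acc)

# A pole tilted slightly right while we keep pushing left: the cart drifts
# left and the tilt grows, i.e. the pole starts to fall.
state = (0.0, 0.0, 0.05, 0.0)
for _ in range(20):
    state = cartpole_step(state, 0)
print(state[2] > 0.05)  # → True: the pole angle has increased
```

The agent's whole job is to choose the force at each timestep so that the pole angle stays near zero; the reward signal it learns from is simply survival at each step.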

Summary

Stanford teaches an entire course solely on reinforcement learning. It would be possible to write an entire book just on reinforcement learning, and in fact that has been done many times. My hope for this chapter is to show you just enough to start you on your way toward solving reinforcement learning problems.

As I solved the Lunar Lander problem, it was easy to let my mind wander from toy problems to actual space exploration with deep Q network-powered agents. I hope this chapter does the same for you.

In the next chapter, I'll show you one last use of deep neural networks: Generative Adversarial Networks, which can generate new images, data points, and even music.
