You're reading from  PyTorch 1.x Reinforcement Learning Cookbook

Product type: Book
Published in: Oct 2019
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781838551964
Edition: 1st Edition
Author (1)
Yuxi (Hayden) Liu

Yuxi (Hayden) Liu was a Machine Learning Software Engineer at Google. Drawing on his earlier experience as a machine learning scientist, he has applied his ML expertise across data-driven domains, including computational advertising, cybersecurity, and information retrieval. He is the author of a series of influential machine learning books and an education enthusiast. His debut book, the first edition of Python Machine Learning by Example, ranked as the #1 bestseller on Amazon and has been translated into many languages.

Capstone Project – Playing Flappy Bird with DQN

In this final chapter, we will work on a capstone project: playing Flappy Bird using reinforcement learning. We will apply what we have learned throughout this book to build an intelligent bot. We will also focus on building Deep Q-Networks (DQNs), fine-tuning model parameters, and deploying the model. Let's see how long the bird can stay in the air.

The capstone project will be built section by section in the following recipes:

  • Setting up the game environment
  • Building a Deep Q-Network to play Flappy Bird
  • Training and tuning the network
  • Deploying the model and playing the game

Because the project is built incrementally, the code in each recipe builds on the code from the previous recipes.

Setting up the game environment

To play Flappy Bird with a DQN, we first need to set up the environment.

We’ll simulate the Flappy Bird game using Pygame. Pygame (https://www.pygame.org) contains a set of Python modules developed for creating video games. It also includes graphics and sound libraries needed in games. We can install the Pygame package as follows:

pip install pygame

Flappy Bird is a famous mobile game originally developed by Dong Nguyen. You can try it yourself, using your keyboard, at https://flappybird.io/. The aim of the game is to remain alive as long as possible. The game ends when the bird touches the floor or a pipe. So, the bird needs to flap its wings at the right times to get through the random pipes and to avoid falling to the ground. Possible actions include flapping and not flapping. In the game environment, the reward is +0.1 for every step...

Building a Deep Q-Network to play Flappy Bird

Now that the Flappy Bird environment is ready, we can start tackling it by building a DQN model.

As we have seen, a screen image is returned at each step after an action is taken. A CNN is one of the best neural network architectures to deal with image inputs. In a CNN, the convolutional layers are able to effectively extract features from images, which will be passed on to fully connected layers downstream. In our solution, we will use a CNN with three convolutional layers and one fully connected hidden layer. An example of CNN architecture is as follows:
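As a minimal sketch of a network with three convolutional layers and one fully connected hidden layer, the following PyTorch module may help make the description concrete. The class name `DQNSketch`, the 84 x 84 input size, the four input channels, and all kernel sizes, strides, and layer widths are illustrative assumptions, not values taken from the recipe.

```python
import torch
import torch.nn as nn


class DQNSketch(nn.Module):
    """Three convolutional layers followed by one fully connected hidden layer."""

    def __init__(self, n_action=2, image_size=84, n_frames=4):
        super().__init__()
        # All layer sizes here are assumptions chosen for illustration.
        self.conv = nn.Sequential(
            nn.Conv2d(n_frames, 32, kernel_size=8, stride=4),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),
            nn.ReLU(),
        )
        # Infer the flattened feature size by passing a dummy input through the conv stack.
        with torch.no_grad():
            dummy = torch.zeros(1, n_frames, image_size, image_size)
            n_features = self.conv(dummy).numel()
        self.fc = nn.Sequential(
            nn.Linear(n_features, 512),  # the single fully connected hidden layer
            nn.ReLU(),
            nn.Linear(512, n_action),    # one output (Q-value) per action
        )

    def forward(self, x):
        features = self.conv(x)
        return self.fc(features.flatten(start_dim=1))


model = DQNSketch()
q_values = model(torch.zeros(1, 4, 84, 84))  # shape: (batch, n_action)
```

The output has one entry per action (flapping and not flapping), so the agent can compare the two estimated values directly.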

How to do it...

Let's develop a CNN-based DQN model as follows:

  1. Import the necessary modules:
>>> import...

Training and tuning the network

In this recipe, we will train the DQN model to play Flappy Bird.

At each training step, we choose an action with an epsilon-greedy policy: with probability epsilon, we take a random action (flapping or not flapping in our case); otherwise, we select the action with the highest estimated value. We also adjust the value of epsilon at each step, favoring exploration at the beginning and exploitation as the DQN model matures.
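The epsilon-greedy rule can be sketched in a few lines of plain Python. The function names, the linear annealing schedule, and the start and end values of epsilon are assumptions made for illustration; the recipe may adjust epsilon differently.

```python
import random


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the highest-valued one."""
    n_action = len(q_values)
    if rng.random() < epsilon:
        return rng.randrange(n_action)  # explore: uniform random action
    return max(range(n_action), key=lambda a: q_values[a])  # exploit


def linear_epsilon(step, n_steps, eps_start=1.0, eps_end=0.01):
    """Linearly anneal epsilon from eps_start to eps_end over n_steps (assumed schedule)."""
    fraction = min(step / n_steps, 1.0)
    return eps_start + fraction * (eps_end - eps_start)


# Early in training epsilon is high, so most actions are random;
# late in training epsilon is low, so most actions are greedy.
early_action = epsilon_greedy_action([0.2, 0.7], linear_epsilon(0, 1000))
late_action = epsilon_greedy_action([0.2, 0.7], linear_epsilon(1000, 1000))
```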

As we have seen, the observation at each step is a two-dimensional image of the screen. We need to transform these observation images into states. A single image does not provide enough information to guide the agent's reaction, because one still frame cannot show how the bird is moving. Hence, we form a state using images from four adjacent steps. We will first reshape the image into the expected size, then concatenate...
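One way to form a state from four adjacent frames is sketched below. It assumes each frame has already been preprocessed into a tensor of shape (1, 84, 84), and that the first state of an episode repeats the first frame four times; the helper names and the frame size are hypothetical.

```python
import torch


def initial_state(first_frame, n_frames=4):
    """Build the first state by repeating the first frame n_frames times."""
    return torch.cat([first_frame] * n_frames, dim=0).unsqueeze(0)


def next_state(state, new_frame):
    """Drop the oldest frame and append the newest one along the channel axis."""
    return torch.cat((state[:, 1:], new_frame.unsqueeze(0)), dim=1)


frame = torch.zeros(1, 84, 84)                     # one preprocessed frame (assumed size)
state = initial_state(frame)                       # shape: (1, 4, 84, 84)
state = next_state(state, torch.ones(1, 84, 84))   # newest frame is now the last channel
```

Keeping the frames in chronological order along the channel axis lets the convolutional layers compare consecutive frames and pick up motion.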

Deploying the model and playing the game

Now that we've trained the DQN model, let's apply it to play the Flappy Bird game.

Playing the game with the trained model is simple. We will just take the action associated with the highest value in each step. We will play a few episodes to see how it performs. Don’t forget to preprocess the raw screen image and construct the state.
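Selecting the action with the highest value can be sketched as follows. The `greedy_action` helper is hypothetical, and a tiny fixed linear layer stands in for the trained DQN so that the snippet runs on its own.

```python
import torch


def greedy_action(model, state):
    """Return the index of the action with the highest predicted Q-value."""
    with torch.no_grad():  # gradients are not needed when only playing
        q_values = model(state)
    return int(torch.argmax(q_values, dim=1).item())


# Stand-in for a trained network: a linear layer fixed to always prefer action 1.
stand_in = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(4, 2))
with torch.no_grad():
    stand_in[1].weight.zero_()
    stand_in[1].bias.copy_(torch.tensor([0.0, 1.0]))

action = greedy_action(stand_in, torch.zeros(1, 4))
```

Unlike training, there is no random exploration here: the same state always yields the same action.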

How to do it...

We test the DQN model on new episodes as follows:

  1. We first load the final model:
>>> model = torch.load("{}/final".format(saved_path))
  2. We run 100 episodes, and we perform the following for each episode:
>>> n_episode = 100
>>> for episode in range(n_episode):

... env = FlappyBird...