
Hands-On Intelligent Agents with OpenAI Gym: Your guide to developing AI agents using deep reinforcement learning

By Palanisamy P

Product Details

Publication date: Jul 31, 2018
Length: 254 pages
Edition: 1st
Language: English
ISBN-13: 9781788836579
Vendor: OpenAI


Introduction to Intelligent Agents and Learning Environments

Greetings! Welcome to the first chapter of this book. This book will introduce you to the awesome OpenAI Gym learning environment and guide you through an exciting journey to equip you with the skills to train state-of-the-art, artificial intelligence agent systems. It will help you develop hands-on experience with reinforcement learning and deep reinforcement learning through practical projects, ranging from developing an autonomous, self-driving car to developing Atari game-playing agents that can surpass human performance. By the time you complete the book, you will be in a position to explore the endless possibilities of using artificial intelligence to solve algorithmic tasks, play games, and solve control problems.

The following topics will be covered in this chapter:

  • Understanding intelligent agents and learning environments
  • Understanding what OpenAI Gym is all about
  • Different categories of tasks/environments that are available, with a brief description of what each category is suitable for
  • Understanding the key features of OpenAI Gym
  • Getting an idea about what you can do with the OpenAI Gym toolkit
  • Creating and visualizing your first Gym environment

Let's start our journey by understanding what an intelligent agent is.

What is an intelligent agent?

A major goal of artificial intelligence is to build intelligent agents. The essential characteristics of an intelligent agent are perceiving its environment, understanding what it perceives, reasoning and learning to plan, and making decisions and acting on them. We will begin this first chapter by understanding what an intelligent agent is, starting from the basic definition of an agent and then adding intelligence on top of that.

An agent is an entity that acts based on the observation (perception) of its environment. Humans and robots are examples of agents with physical forms.

A human or an animal is an example of an agent that uses its organs (eyes, ears, nose, skin, and so on) as sensors to observe/perceive its environment and acts using its physical body (arms, hands, legs, head, and so on). A robot uses its sensors (cameras, microphones, LiDAR, radar, and so on) to observe/perceive its environment and acts using its physical robotic body (robotic arms, robotic hands/grippers, robotic legs, speakers, and so on).

Software agents are computer programs that are capable of making decisions and taking actions through interaction with their environment. A software agent can be embodied in a physical form, such as a robot. Autonomous agents are entities that make decisions autonomously and take actions based on their understanding of and reasoning about their observations of their environment.

An intelligent agent is an autonomous entity that can learn and improve based on its interactions with its environment. An intelligent agent is capable of analyzing its own behavior and performance using its observations.

In this book, we will develop intelligent agents to solve sequential decision-making problems that can be solved using a sequence of (independent) decisions/actions in a (loosely) Markovian environment, where feedback in the form of reward signals is available (through percepts), at least in some environmental conditions.

Learning environments

A learning environment is an integral component of the system in which an intelligent agent is trained. It defines the problem or the task for the agent to complete.

A problem or task in which the outcome depends on a sequence of decisions made or actions taken is a sequential decision-making problem. Here are some of the varieties of learning environments:

  • Fully observable versus partially observable
  • Deterministic versus stochastic
  • Episodic versus sequential
  • Static versus dynamic
  • Discrete versus continuous
  • Discrete state space versus continuous state space
  • Discrete action space versus continuous action space

In this book, we will be using learning environments implemented using the OpenAI Gym Python library, as it provides a simple and standard interface and environment implementations, along with the ability to implement new custom environments.

In the following subsections, we will get a glimpse of the OpenAI Gym toolkit. This section is geared towards familiarizing a complete newbie with the OpenAI Gym toolkit. No prior knowledge or experience is assumed. We will first try to get a feel for the Gym toolkit and walk through the various environments that are available under different categories. We will then discuss the features of Gym that might be of interest to you, irrespective of the application domain that you are interested in. We'll then briefly discuss what the value proposition of the Gym toolkit is and how you can utilize it. We will be building several cool and intelligent agents in subsequent chapters, building on top of the Gym toolkit. So, this chapter is really the foundation for all that. We will also be quickly creating and visualizing our first OpenAI Gym environment towards the end of this chapter. Excited? Let's jump right in.

What is OpenAI Gym?

OpenAI Gym is an open source toolkit that provides a diverse collection of tasks, called environments, with a common interface for developing and testing your intelligent agent algorithms. The toolkit introduces a standard Application Programming Interface (API) for interfacing with environments designed for reinforcement learning. Each environment has a version attached to it, which ensures meaningful comparisons and reproducible results with the evolving algorithms and the environments themselves.

The Gym toolkit, through its various environments, provides an episodic setting for reinforcement learning, where an agent's experience is broken down into a series of episodes. In each episode, the initial state of the agent is randomly sampled from a distribution, and the interaction between the agent and the environment proceeds until the environment reaches a terminal state. Do not worry if you are not familiar with reinforcement learning. You will be introduced to reinforcement learning in Chapter 2, Reinforcement Learning and Deep Reinforcement Learning.
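
To make this concrete, here is a minimal sketch of the episodic agent-environment loop, written against the classic Gym API used throughout this book (env.reset() returns an observation, and env.step() returns a four-tuple); the randomly sampled action is only a stand-in for a real agent's decision:

import gym

env = gym.make('CartPole-v0')

# An agent's experience is broken into episodes: each episode begins from a
# freshly sampled initial state and runs until the environment reports that
# a terminal state has been reached (done == True).
for episode in range(3):
    observation = env.reset()  # sample an initial state for this episode
    done = False
    total_reward = 0.0
    while not done:
        action = env.action_space.sample()  # placeholder for an agent's policy
        observation, reward, done, info = env.step(action)
        total_reward += reward
    print('Episode {} finished with return {}'.format(episode, total_reward))

env.close()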

Some of the basic environments available in the OpenAI Gym library are shown in the following screenshot:

Examples of basic environments available in the OpenAI Gym with a short description of the task

At the time of writing this book, OpenAI Gym natively includes about 797 environments spread over different categories of tasks. The famous Atari category has the largest share, with about 116 environments (half with screen inputs and half with RAM inputs)! The categories of tasks/environments supported by the toolkit are listed here:

  • Algorithmic
  • Atari
  • Board games
  • Box2D
  • Classic control
  • Doom (unofficial)
  • Minecraft (unofficial)
  • MuJoCo
  • Soccer
  • Toy text
  • Robotics (newly added)

The various types of environment (or task) available under the different categories, along with a brief description of each, are given next. Keep in mind that you may need some additional tools and packages installed on your system to run environments in each of these categories. Do not worry! We will go over every single step you need to take to get any environment up and running in the upcoming chapters. Stay tuned!

We will now see the previously mentioned categories in detail, as follows:

  • Algorithmic environments: They provide tasks that require an agent to perform computations, such as the addition of multi-digit numbers, copying data from an input sequence, reversing sequences, and so on.
  • Atari environments: These offer interfaces to several classic Atari console games. These environment interfaces are wrappers on top of the Arcade Learning Environment (ALE). They provide the game's screen images or RAM as input to train your agents.
  • Board games: This category has the environment for the popular game Go on 9 x 9 and 19 x 19 boards. For those of you who have been following the recent breakthroughs by Google's DeepMind in the game of Go, this might be very interesting. DeepMind developed an agent named AlphaGo, which used reinforcement learning and other learning and planning techniques, including Monte Carlo tree search, to beat the top-ranked human Go players in the world, including Fan Hui and Lee Sedol. DeepMind also published their work on AlphaGo Zero, which was trained from scratch, unlike the original AlphaGo, which used sample games played by humans. AlphaGo Zero surpassed the original AlphaGo's performance. Later, AlphaZero was published; it is an autonomous system that learned to play chess, Go, and Shogi using self-play (without any human supervision for training) and reached performance levels higher than the previous systems developed.
  • Box2D: This is an open source physics engine used for simulating rigid bodies in 2D. The Gym toolkit has a few continuous control tasks that are developed using the Box2D simulator:
A sample list of environments built using the Box2D simulator

The tasks include training a bipedal robot to walk, navigating a lunar lander to its landing pad, and training a race car to drive around a race track. Exciting! In this book, we will train an AI agent using reinforcement learning to drive a race car around the track autonomously! Stay tuned.

  • Classic control: This category has many tasks that were widely used in reinforcement learning literature in the past, and they formed the basis for some of the early development and benchmarking of reinforcement learning algorithms. For example, one of the environments available under the classic control category is the Mountain Car environment, which was first introduced in 1990 by Andrew Moore (Dean of the School of Computer Science at CMU, Pittsburgh) in his PhD thesis. This environment is still sometimes used as a test bed for reinforcement learning algorithms. You will create your first OpenAI Gym environment from this category in just a few moments, towards the end of this chapter!
  • Doom: This category provides an environment interface for the popular first-person shooter game Doom. It is an unofficial, community-created Gym environment category based on ViZDoom, a Doom-based AI research platform that provides an easy-to-use API suitable for developing intelligent agents from raw visual inputs. It enables the development of AI bots that can play several challenging rounds of the Doom game using only the screen buffer! If you have played this game, you know how thrilling and difficult it is to progress through some of the rounds without losing lives! Although it does not have the cool graphics of some of the newer first-person shooters, it is a great game. In recent times, several studies in machine learning, especially in deep reinforcement learning, have utilized the ViZDoom platform and developed new algorithms to tackle the goal-directed navigation problems encountered in the game. You can visit ViZDoom's research web page (http://vizdoom.cs.put.edu.pl/research) for a list of research studies that use this platform. The following screenshot lists some of the missions that are available as separate environments in the Gym for training your agents:
List of missions or rounds available in Doom environments
  • Minecraft: This is another great platform, and game AI developers in particular might be very interested in this environment. Minecraft is a popular video game among hobbyists. The Minecraft Gym environment was built using Microsoft's Project Malmo, which is a platform for artificial intelligence experimentation and research built on top of Minecraft. Some of the missions that are available as environments in the OpenAI Gym are shown in the following screenshot. These environments provide inspiration for developing solutions to the challenging new problems presented by this unique platform:
Environments in MineCraft available in OpenAI Gym
  • MuJoCo: Are you interested in robotics? Do you dream of developing algorithms that can make a humanoid walk and run, or do a backflip like Boston Dynamics' Atlas robot? You can! You will be able to apply the reinforcement learning methods you learn in this book in the OpenAI Gym MuJoCo environments to develop your own algorithms that can make a 2D robot walk, run, swim, or hop, or make a 3D multi-legged robot walk or run! Several cool, real-world, robot-like environments are available under the MuJoCo category.
  • Soccer: This is an environment suitable for training multiple agents that can cooperate with each other. The soccer environments available through the Gym toolkit have continuous state and action spaces. Wondering what that means? You will learn all about it when we talk about reinforcement learning in the next chapter. For now, here is a simple explanation: a continuous state and action space means that both the action that an agent can take and the input that the agent receives are continuous values. This means that they can take any real number value between, say, 0 and 1 (0.5, 0.005, and so on), rather than being limited to a few discrete sets of values, such as {1, 2, 3}; the short snippet after this list shows how to inspect an environment's spaces. There are three types of soccer environment. The plain soccer environment initializes a single opponent on the field and gives a reward of +1 for scoring a goal and 0 otherwise. In order to score a goal, an agent needs to learn to identify the ball, approach the ball, and kick the ball towards the goal. Sound simple enough? It is really hard for a computer to figure that out on its own, especially when all you say is +1 when it scores a goal and 0 in any other case, with no other clues! You can develop agents that will learn all about soccer by themselves and learn to score goals using the methods that you will learn in this book.
  • Toy text: OpenAI Gym also has some simple text-based environments under this category. These include some classic problems such as Frozen Lake, where the goal is to find a safe path to cross a grid of ice and water tiles. It is categorized under toy text because it uses a simpler environment representation—mostly through text.
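
To make the discrete-versus-continuous distinction concrete, here is a small snippet that inspects the action spaces of a discrete and a continuous environment (a sketch assuming a standard Gym installation; MountainCarContinuous-v0 belongs to the classic control category, so no extra packages are needed, and the exact printed representations may vary slightly between Gym versions):

import gym

# CartPole has a Discrete action space: the valid actions are the
# integers 0 and 1 (push the cart left or right).
discrete_env = gym.make('CartPole-v0')
print(discrete_env.action_space)         # Discrete(2)

# MountainCarContinuous has a continuous (Box) action space: a valid
# action is any real-valued force within the bounds below.
continuous_env = gym.make('MountainCarContinuous-v0')
print(continuous_env.action_space)       # Box(1,)
print(continuous_env.action_space.low)   # lower bound of the action value
print(continuous_env.action_space.high)  # upper bound of the action value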

With that, you have a very good overview of all the different categories and types of environment that are available as part of the OpenAI Gym toolkit. It is worth noting that the release of the OpenAI Gym toolkit was accompanied by an OpenAI Gym website (gym.openai.com), which maintained a scoreboard for every algorithm that was submitted for evaluation. It showcased the performance of user-submitted algorithms, and some submissions were also accompanied by detailed explanations and source code. Unfortunately, OpenAI decided to withdraw support for the evaluation website. The service went offline in September 2017.

Now you have a good picture of the various categories of environment available in OpenAI Gym and what each category provides you with. Next, we will look at the key features of OpenAI Gym that make it an indispensable component in many of today's advancements in intelligent agent development, especially those that use reinforcement learning or deep reinforcement learning.

Understanding the features of OpenAI Gym

In this section, we will take a look at the key features that have made the OpenAI Gym toolkit very popular in the reinforcement learning community and led to it becoming widely adopted.

Simple environment interface

OpenAI Gym provides a simple and common Python interface to environments. Specifically, the interface takes an action as input and, at each step, returns an observation, a reward, a done flag, and an optional info object based on that action. If this does not make perfect sense to you yet, do not worry. We will go over the interface again in a more detailed manner to help you understand; this paragraph is just to give you an overview and to make it clear how simple it is. This simple and convenient interface provides great flexibility, as users can design and develop their agent algorithms based on any paradigm they like, rather than being constrained to any particular one.
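
As a quick illustration of just how small this interface is, here is a minimal sketch using the classic Gym API (one reset, then one step with a randomly sampled action):

import gym

env = gym.make('CartPole-v0')
observation = env.reset()

# The entire contract in one line: an action goes in, and an
# (observation, reward, done, info) tuple comes back.
action = env.action_space.sample()
observation, reward, done, info = env.step(action)
print(reward, done, info)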

Comparability and reproducibility

We intuitively feel that we should be able to compare the performance of an agent or an algorithm on a particular task to the performance of another agent or algorithm on the same task. For example, if an agent gets a score of 1,000 on average in the Atari game Space Invaders, we should be able to tell that this agent is performing worse than an agent that scores 5,000 on average in the same game with the same amount of training time. But what happens if the scoring system for the game is slightly changed? Or if the environment interface is modified to include additional information about the game states that would give the second agent an advantage? That would make the score-to-score comparison unfair, right?

To handle such changes in the environment, OpenAI Gym uses strict versioning for environments. The toolkit guarantees that if there is any change to an environment, it will be accompanied by a different version number. Therefore, if the original version of the Atari Space Invaders game environment was named SpaceInvaders-v0 and there were some changes made to the environment to provide more information about the game states, then the environment's name would be changed to SpaceInvaders-v1. This simple versioning system makes sure we are always comparing performance measured on the exact same environment setup. This way, the results obtained are comparable and reproducible.
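
Because the version suffix is part of the environment ID itself, two versions of a task are treated as entirely separate environments. A quick sketch (CartPole is used here purely for illustration; CartPole-v1 differs from CartPole-v0 in details such as the episode length limit):

import gym

env_v0 = gym.make('CartPole-v0')
env_v1 = gym.make('CartPole-v1')  # a different environment, not an in-place upgrade
print(env_v0.spec.id)  # CartPole-v0 -- the version is baked into the ID
print(env_v1.spec.id)  # CartPole-v1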

Ability to monitor progress

All the environments available as part of the Gym toolkit are equipped with a monitor. This monitor logs every time step of the simulation and every reset of the environment. What this means is that the environment automatically keeps track of how our agent is learning and adapting with every step. You can even configure the monitor to automatically record videos of the game while your agent is learning to play. How cool is that?
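
As a sketch of how this looks in code with the Gym releases contemporary with this book (the Monitor wrapper lived in gym.wrappers at the time and was removed from later releases; video recording additionally assumes ffmpeg is available on your system):

import gym
from gym import wrappers

env = gym.make('CartPole-v0')
# Wrapping the environment with a Monitor logs every step and every reset
# to the given directory, and records videos of episodes as the agent acts.
env = wrappers.Monitor(env, '/tmp/cartpole-monitor', force=True)

observation = env.reset()
done = False
while not done:
    observation, reward, done, info = env.step(env.action_space.sample())
env.close()  # finalizes the monitor's log and video files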

What can you do with the OpenAI Gym toolkit?

The Gym toolkit provides a standardized way of defining the interface for environments developed for problems that can be solved using reinforcement learning. If you are familiar with or have heard of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), you may realize how much of an impact a standard benchmarking platform can have on accelerating research and development. For those of you who are not familiar with ILSVRC, here is a brief summary: it is a competition where the participating teams evaluate the supervised learning algorithms they have developed for the given dataset and compete to achieve higher accuracy on several visual recognition tasks. This common platform, coupled with the success of deep neural network-based algorithms popularized by AlexNet (https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf), paved the way for the deep learning era we are in at the moment.

In a similar way, the Gym toolkit provides a common platform to benchmark reinforcement learning algorithms and encourages researchers and engineers to develop algorithms that can achieve higher rewards for several challenging tasks. In short, the Gym toolkit is to reinforcement learning what ILSVRC is to supervised learning.

Creating your first OpenAI Gym environment

We will be going over the steps to set up the OpenAI Gym dependencies and other tools required for training your reinforcement learning agents in detail in Chapter 3, Getting Started with OpenAI Gym and Deep Reinforcement Learning. This section provides a quick way to get started with the OpenAI Gym Python API on Linux and macOS using virtualenv, so that you can get a sneak peek into the Gym!

macOS and Ubuntu Linux systems come with Python installed by default. You can check which version of Python is installed by running python --version from a terminal window. If this returns Python followed by a version number, then you are good to proceed to the next steps! If you get an error saying the Python command was not found, then you have to install Python. Please refer to the detailed installation section in Chapter 3, Getting Started with OpenAI Gym and Deep Reinforcement Learning, of this book:

  1. Install virtualenv:
$ pip install virtualenv
If pip is not installed on your system, you can install it by typing sudo easy_install pip.
  2. Create a virtual environment named openai-gym using the virtualenv tool:
$ virtualenv openai-gym
  3. Activate the openai-gym virtual environment:
$ source openai-gym/bin/activate
  4. Install all the packages for the Gym toolkit from upstream:
$ pip install -U gym
If you get a permission denied error or a failed with error code 1 message when you run the pip install command, it is most likely because the permissions on the directory you are trying to install the package to (the openai-gym directory inside virtualenv, in this case) require special/root privileges. You can either run sudo -H pip install -U gym[all] to solve the issue or change the permissions on the openai-gym directory by running sudo chmod -R o+rw ~/openai-gym.
  5. Test to make sure the installation is successful:
$ python -c 'import gym; gym.make("CartPole-v0");'
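
If that command exits without an error, the installation worked. For a slightly more informative sanity check, here is an optional one-liner (a sketch, not from the book) that also prints the environment's observation space, action space, and an initial observation:
$ python -c 'import gym; env = gym.make("CartPole-v0"); print(env.observation_space, env.action_space, env.reset())'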

Creating and visualizing a new Gym environment

In just a minute or two, you will create an instance of an OpenAI Gym environment to get started!

Let's open a new Python prompt and import the gym module:

>>> import gym

Once the gym module is imported, we can use the gym.make method to create our new environment like this:

>>> env = gym.make('CartPole-v0')
>>> env.reset()
>>> env.render()

This will bring up a window rendering the CartPole environment: a cart with a pole balanced on top of it. Hooray!

Summary

Congrats on completing the first chapter! Hope you had fun creating your own environment. In this chapter, you learned what OpenAI Gym is all about, what features it provides, and what you can do with the toolkit. You now have a very good idea about OpenAI Gym. In the next chapter, we will go over the basics of reinforcement learning to give you a good foundation, which will help you build your cool intelligent agents as you progress through the book. Excited? Move on to the next chapter!


Key benefits

  • Explore the OpenAI Gym toolkit and interface to use over 700 learning tasks
  • Implement agents to solve simple to complex AI problems
  • Study learning environments and discover how to create your own

Description

Many real-world problems can be broken down into tasks that require a series of decisions to be made or actions to be taken. The ability to solve such tasks without being explicitly programmed requires a machine to be artificially intelligent and capable of learning to adapt. This book is an easy-to-follow guide to implementing learning algorithms for software agents in order to solve discrete and continuous sequential decision-making and control tasks. Hands-On Intelligent Agents with OpenAI Gym takes you through the process of building intelligent agent algorithms using deep reinforcement learning, starting with the implementation of the building blocks for configuring, training, logging, visualizing, testing, and monitoring the agent. You will walk through the process of building intelligent agents from scratch to perform a variety of tasks. In the closing chapters, the book provides an overview of the latest learning environments and learning algorithms, along with pointers to more resources that will help you take your deep reinforcement learning skills to the next level.

What you will learn

  • Explore intelligent agents and learning environments
  • Understand the basics of RL and deep RL
  • Get started with OpenAI Gym and PyTorch for deep reinforcement learning
  • Discover deep Q-learning agents to solve discrete optimal control tasks
  • Create custom learning environments for real-world problems
  • Apply a deep actor-critic agent to drive a car autonomously in CARLA
  • Use the latest learning environments and algorithms to upgrade your intelligent agent development skills


Table of Contents

Preface
Introduction to Intelligent Agents and Learning Environments
Reinforcement Learning and Deep Reinforcement Learning
Getting Started with OpenAI Gym and Deep Reinforcement Learning
Exploring the Gym and its Features
Implementing your First Learning Agent - Solving the Mountain Car problem
Implementing an Intelligent Agent for Optimal Control using Deep Q-Learning
Creating Custom OpenAI Gym Environments - CARLA Driving Simulator
Implementing an Intelligent & Autonomous Car Driving Agent using Deep Actor-Critic Algorithm
Exploring the Learning Environment Landscape - Roboschool, Gym-Retro, StarCraft-II, DeepMindLab
Exploring the Learning Algorithm Landscape - DDPG (Actor-Critic), PPO (Policy-Gradient), Rainbow (Value-Based)
Other Books You May Enjoy
