Deep Learning with TensorFlow 2 and Keras - Second Edition

Reinforcement Learning

This chapter introduces reinforcement learning (RL), one of the least explored and yet most promising learning paradigms. Reinforcement learning is very different from the supervised and unsupervised learning models we covered in earlier chapters. Starting from a clean slate (that is, with no prior information), the RL agent goes through many rounds of trial and error and learns to achieve a goal, with feedback from the environment as its only input. Recent research in RL by OpenAI suggests that continuous competition can drive the evolution of intelligence [1]. Many deep learning practitioners believe that RL will play an important role in the big AI dream: Artificial General Intelligence (AGI). This chapter delves into different RL algorithms, covering the following topics:

  • What RL is and its terminology
  • How to use the OpenAI Gym interface
  • Deep Q-Networks
  • Policy gradients

Introduction

What is common between a baby learning to walk, a bird learning to fly, and an RL agent learning to play an Atari game? Well, all three involve:

  • Trial and error: The child (or the bird) tries various ways, fails many times, and succeeds in some before it can really stand (or fly). The RL agent plays many games, winning some and losing many, before it can become reliably successful.
  • Goal: The child has the goal to stand, the bird to fly, and the RL agent has the goal to win the game.
  • Interaction with the environment: The only feedback they have is from their environment.

So, the first question that arises is: what is RL, and how is it different from supervised and unsupervised learning? Anyone who owns a pet knows that the best strategy for training it is rewarding desirable behavior and punishing bad behavior. RL, also called learning with a critic, is a learning paradigm where the agent learns in the same manner. The agent here corresponds...

Introduction to OpenAI Gym

As mentioned earlier, trial and error is an important component of any RL algorithm. Therefore, it makes sense to first train our RL agent in a simulated environment.

Today there exist a large number of platforms for creating such environments. Some popular ones are:

  • OpenAI Gym: It contains a collection of environments that we can use to train our RL agents. In this chapter we'll be using the OpenAI Gym interface; a minimal usage sketch appears right after this list.
  • Unity ML-Agents SDK: It allows developers to transform games and simulations created using the Unity editor into environments where intelligent agents can be trained using DRL, evolutionary strategies, or other machine learning methods through a simple-to-use Python API. It works with TensorFlow and provides the ability to train intelligent agents for 2D/3D and VR/AR games. You can learn more about it here: https://github.com/Unity-Technologies/ml-agents.
  • Gazebo: In Gazebo, we can build three-dimensional...
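
The Gym interface is simple: create an environment, reset it to get an initial observation, and repeatedly step it with an action to receive the next observation, a reward, and a done flag. The following is a minimal sketch using the classic CartPole environment and a random agent; it assumes the classic `gym` API (pre-0.26, where `step` returns a 4-tuple), which matches the versions current when this book was published:

```python
import gym  # pip install gym

# Create the environment and run a random agent for a fixed number of steps.
env = gym.make("CartPole-v1")
obs = env.reset()  # initial observation

for _ in range(200):
    action = env.action_space.sample()           # pick a random action
    obs, reward, done, info = env.step(action)   # apply it to the environment
    if done:  # episode ended (pole fell over or time limit reached)
        obs = env.reset()

env.close()
```

A trained agent simply replaces `env.action_space.sample()` with a policy that maps observations to actions.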

Deep Q-Networks

Deep Q-networks, DQNs for short, are deep neural networks designed to approximate the Q-function (the state-action value function), and DQN is one of the most popular value-based reinforcement learning algorithms. The model was proposed by Google's DeepMind at NIPS 2013, in the paper entitled Playing Atari with Deep Reinforcement Learning. The most important contribution of this paper was that the raw state space was used directly as input to the network; the input features were not hand-crafted as in earlier RL implementations. Also, the agent could be trained with exactly the same architecture to play different Atari games and obtain state-of-the-art results.
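
In TensorFlow 2.x with Keras (the stack used throughout this book), such a Q-network can be sketched as a small fully connected model that maps a state vector to one Q-value per discrete action. The layer sizes below are illustrative assumptions, not the architecture from the paper (which used convolutions over raw pixels):

```python
import tensorflow as tf

# A minimal sketch of a Q-network: state in, one Q-value per action out.
def build_q_network(state_dim=4, num_actions=2):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(state_dim,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(num_actions)  # linear outputs: Q(s, a) for each a
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")
    return model

# The greedy action for a batch of one state s would then be:
# tf.argmax(model(s[None, :])[0])
```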

This model is an extension of the simple Q-learning algorithm. In Q-learning algorithms, a Q-table is maintained as a cheat sheet. After each action the Q-table is updated using the Bellman equation [5]:

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right]$$

Here $\alpha$ is the learning rate, and its value lies in the range [0, 1]. The first term represents the...
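
In code, the tabular update is essentially a one-liner. The following is a small illustrative sketch (the names and the integer state/action encoding are assumptions for illustration, not from the book):

```python
import numpy as np

# Tabular Q-learning update: a direct translation of the equation above.
# Q is a (num_states, num_actions) array indexed by integer states/actions.
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    td_target = r + gamma * np.max(Q[s_next])  # r_{t+1} + gamma * max_a Q(s', a)
    Q[s, a] += alpha * (td_target - Q[s, a])   # move Q(s, a) toward the target
```

A DQN replaces the table with a neural network and minimizes the squared difference between $Q(s_t, a_t)$ and the same target.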

Deep deterministic policy gradient

DQN and its variants have been very successful in solving problems where the state space is continuous and the action space is discrete. For example, in Atari games the input space consists of raw pixels, but the actions are discrete: [up, down, left, right, no-op]. How do we solve a problem with a continuous action space? For instance, an RL agent driving a car needs to turn its steering wheel, an action with a continuous range of values. One way to handle this situation is by discretizing the action space and continuing with DQN or its variants, as in the sketch below. However, a better solution is to use a policy gradient algorithm. In policy gradient methods, the policy is approximated directly.
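
Discretization itself is simple; the sketch below (illustrative values, not from the book) maps a continuous steering command onto a small set of bins so that a DQN-style agent with discrete actions can still be used:

```python
import numpy as np

# Map a continuous steering value in [-1, 1] onto 5 discrete actions.
STEERING_BINS = np.linspace(-1.0, 1.0, 5)  # [-1., -0.5, 0., 0.5, 1.]

def to_discrete(steering):
    # Index of the bin closest to the requested continuous value.
    return int(np.argmin(np.abs(STEERING_BINS - steering)))
```

The cost is a coarse policy, and the number of discrete actions explodes as action dimensions are added, which is why policy gradient methods are preferred here.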

A neural network is used to approximate the policy. In its simplest form, the neural network learns a policy for selecting actions that maximize the rewards by adjusting its weights using gradient ascent; hence the name: policy gradients.
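
For the deterministic case that gives DDPG its name, the actor network maps a state directly to a continuous action. The following is a minimal sketch of such an actor in TensorFlow 2.x Keras; all dimensions and the action bound are illustrative assumptions, not values from the book:

```python
import tensorflow as tf

# A minimal deterministic actor: state in, continuous action out.
# tanh bounds the raw output to [-1, 1]; we then scale to the task's range.
def build_actor(state_dim=3, action_dim=1, action_bound=2.0):
    inputs = tf.keras.Input(shape=(state_dim,))
    x = tf.keras.layers.Dense(64, activation="relu")(inputs)
    x = tf.keras.layers.Dense(64, activation="relu")(x)
    raw = tf.keras.layers.Dense(action_dim, activation="tanh")(x)
    action = tf.keras.layers.Lambda(lambda t: t * action_bound)(raw)
    return tf.keras.Model(inputs, action)
```

In full DDPG, this actor is trained by ascending a separate critic network's estimate of $Q(s, \mu(s))$, where $\mu$ is the actor.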

In this section we will focus...

Summary

Reinforcement learning has seen a lot of progress in recent years, and summarizing all of it in a single chapter is not possible. Instead, this chapter focused on recent successful RL algorithms. It started by introducing the important concepts in the RL field, its challenges, and the solutions available to move forward. Next, we delved into two important RL algorithms: DQN and DDPG. Toward the end, the chapter covered important topics in the field of deep reinforcement learning. In the next chapter, we will move on to applying what we have learned to production.

References

  1. https://www.technologyreview.com/s/614325/open-ai-algorithms-learned-tool-use-and-cooperation-after-hide-and-seek-games/
  2. Coggan, Melanie. Exploration and Exploitation in Reinforcement Learning. Research supervised by Prof. Doina Precup, CRA-W DMP Project at McGill University (2004).
  3. Lin, Long-Ji. Reinforcement Learning for Robots Using Neural Networks. Technical Report CMU-CS-93-103, Carnegie Mellon University, School of Computer Science, 1993.
  4. Schaul, Tom, John Quan, Ioannis Antonoglou, and David Silver. Prioritized Experience Replay. arXiv preprint arXiv:1511.05952 (2015).
  5. Sutton, Richard, and Andrew Barto. Reinforcement Learning: An Introduction, Chapter 4. MIT Press. https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf
  6. Dabney, Will, Mark Rowland, Marc G. Bellemare, and Rémi Munos. Distributional Reinforcement Learning with Quantile Regression. In Thirty-Second AAAI...