Deep Reinforcement Learning Hands-On


Product type: Book
Published: Jun 2018
Publisher: Packt
ISBN-13: 9781788834247
Pages: 546
Edition: 1st Edition
Author: Maxim Lapan


The Bellman equation of optimality


To explain the Bellman equation, it's better to go a bit abstract. Don't be afraid, I'll provide concrete examples later to support your intuition! Let's start with the deterministic case, in which every action has a 100% guaranteed outcome. Imagine that our agent observes state $s_0$ and has $N$ available actions. Every action leads to another state, $s_1 \dots s_N$, with a respective reward, $r_1 \dots r_N$. Also assume that we know the values, $V_i$, of all states connected to the state $s_0$. What is the best course of action that the agent can take in such a state?

Figure 3: An abstract environment with N states reachable from the initial state

If we choose the concrete action $a_i$ and calculate the value given to this action, then the value will be $V_0(a = a_i) = r_i + V_i$, where $r_i$ is the reward for that action and $V_i$ is the value of the state it leads to. So, to choose the best possible action, the agent needs to calculate the resulting values for every action and choose the maximum possible outcome. In other words: $V_0 = \max_{a \in 1 \dots N} (r_a + V_a)$. If we're using discount factor $\gamma$, we need to multiply the value of the next state...
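To make the deterministic case concrete, here is a minimal Python sketch of the rule above: compute the reward plus next-state value for every action and take the maximum. The function name `best_action`, its parameters, and the sample numbers are illustrative assumptions, not code from the book. The `gamma` argument assumes the standard discounted form, in which the next state's value is multiplied by the discount factor; with the default `gamma=1.0` it reduces to the undiscounted rule.

```python
from typing import Sequence, Tuple


def best_action(
    rewards: Sequence[float],
    next_values: Sequence[float],
    gamma: float = 1.0,
) -> Tuple[int, float]:
    """Return (index of the best action, value of the current state).

    rewards[i] is the immediate reward for action i, and next_values[i]
    is the known value of the state that action i leads to.
    Hypothetical helper for illustration only.
    """
    if not rewards or len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must be non-empty and equal in length")

    # Value of taking each action: immediate reward plus the
    # (optionally discounted) value of the resulting state.
    action_values = [r + gamma * v for r, v in zip(rewards, next_values)]

    # The value of the current state is the best achievable action value.
    best = max(range(len(action_values)), key=action_values.__getitem__)
    return best, action_values[best]


# Made-up example with N = 3 actions.
rewards = [1.0, 2.0, 0.0]
next_values = [2.0, 0.5, 4.0]

print(best_action(rewards, next_values))             # (2, 4.0)
print(best_action(rewards, next_values, gamma=0.5))  # (1, 2.25)
```

Note how the discount factor changes the preferred action in this example: without discounting, the third action wins because of the high value of the state it leads to, while with `gamma=0.5` the larger immediate reward of the second action dominates.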
