You're reading from  Mastering Reinforcement Learning with Python

Product type: Book
Published in: Dec 2020
Reading level: Beginner
Publisher: Packt
ISBN-13: 9781838644147
Edition: 1st Edition
Author (1)
Enes Bilgin

Enes Bilgin works as a senior AI engineer and a tech lead in Microsoft's Autonomous Systems division. He is a machine learning and operations research practitioner and researcher with experience in building production systems and models for top tech companies using Python, TensorFlow, and Ray/RLlib. He holds an M.S. and a Ph.D. in systems engineering from Boston University and a B.S. in industrial engineering from Bilkent University. In the past, he has worked as a research scientist at Amazon and as an operations research scientist at AMD. He also held adjunct faculty positions at the McCombs School of Business at the University of Texas at Austin and at the Ingram School of Engineering at Texas State University.

Need for policy-based methods

We start this chapter by discussing why we need policy-based methods when we have already introduced many value-based methods. Policy-based methods (i) are arguably more principled, as they directly optimize the policy parameters; (ii) allow us to use continuous action spaces; and (iii) are able to learn truly stochastic policies. Let's now go into the details of each of these points.

A more principled approach

In Q-learning, a policy is obtained indirectly: we learn action values, which are then used to determine the best action(s). But do we really need to know the value of an action? Most of the time we don't, as action values are only proxies that get us to optimal policies. Policy-based methods learn function approximations that directly give policies, without such an intermediate step. This is arguably a more principled approach because we can take gradient steps directly to optimize the policy, not the proxy action-value...
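
The contrast between the two routes can be illustrated with a minimal sketch. It is not taken from the chapter: the three-armed bandit, the softmax parameterization, and the names `greedy_from_q`, `policy_gradient_step`, and `train_policy` are all assumptions chosen for illustration. The value-based route derives the policy indirectly through an argmax over action values, while the policy-based route takes a REINFORCE-style gradient step on the policy parameters themselves.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences (the policy parameters) into probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_from_q(q_values):
    """Value-based route: the policy is derived indirectly via argmax over Q."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def policy_gradient_step(theta, action, reward, lr=0.1):
    """Policy-based route: one REINFORCE-style update on theta itself.

    For a softmax policy, the gradient of log pi(action) with respect to
    theta[k] is (1 if k == action else 0) - pi(k).
    """
    probs = softmax(theta)
    return [
        t + lr * reward * ((1.0 if k == action else 0.0) - probs[k])
        for k, t in enumerate(theta)
    ]


def train_policy(mean_rewards, steps=2000, lr=0.1, seed=0):
    """Hypothetical toy problem: a bandit with noisy rewards per action."""
    rng = random.Random(seed)
    theta = [0.0] * len(mean_rewards)  # start from a uniform policy
    for _ in range(steps):
        probs = softmax(theta)
        # Sample an action from the current stochastic policy.
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = mean_rewards[action] + rng.gauss(0.0, 0.1)
        theta = policy_gradient_step(theta, action, reward, lr)
    return softmax(theta)


if __name__ == "__main__":
    means = [0.2, 1.0, 0.5]  # assumed mean reward per action
    print("greedy action from known Q-values:", greedy_from_q(means))
    print("learned policy probabilities:", train_policy(means))
```

Note that `policy_gradient_step` never estimates an action value: the sampled reward only scales the step taken on the policy parameters. A practical implementation would typically add a baseline to reduce variance, which this sketch omits for brevity.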
