Introducing the reward: Markov reward process

In our robot example so far, we have not identified any state as "good" or "bad." In any system, however, there are states we want to be in and states we want to avoid. In this section, we attach rewards to states/transitions, which gives us a Markov Reward Process (MRP), and we then assess the "value" of each state.
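As a quick reference for what "value" will mean here, these are the standard definitions of the discounted return and the state value in an MRP (the discount factor γ is formally introduced later in the chapter, and its notation may differ slightly from what follows):

    % Discounted return from time step t onward
    G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots
        = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}, \qquad 0 \le \gamma \le 1
    % Value of state s: the expected return when starting from s
    v(s) = \mathbb{E}\left[ G_t \mid S_t = s \right]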

Attaching rewards to the grid world example

Remember the version of the robot example in which the robot does not bounce back to its current cell when it hits a wall, but instead crashes irrecoverably. From now on, we will work with that version and attach rewards to the process. Now, let's build this example:

  1. We modify the transition probability matrix to assign self-transition probabilities to the "crashed" state that we add to the matrix (a complete sketch of this step follows the snippet below):
    # One extra row/column for the absorbing "crashed" state
    P = np.zeros((m2 + 1, m2 + 1))
    # get_P (defined earlier in the chapter) returns the m2 x m2 grid transition matrix
    P[:m2, :m2] = get_P(3, 0.2, 0.3, 0.25, 0.25)
    for i in...
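A minimal, self-contained sketch of how this step can be completed, assuming m = 3, m2 = m ** 2, that the crashed state takes index m2, and that each cell's self-transition probability (the mass from hitting a wall) is redirected to the crashed state; the stand-in P_grid below replaces the chapter's get_P helper so that the sketch runs on its own:

    import numpy as np

    m = 3                    # the grid is m x m
    m2 = m ** 2              # number of grid cells
    np.random.seed(0)

    # Stand-in for the chapter's get_P helper: any row-stochastic
    # m2 x m2 matrix works for illustrating the construction.
    P_grid = np.random.rand(m2, m2)
    P_grid /= P_grid.sum(axis=1, keepdims=True)

    # Extra row/column for the absorbing "crashed" state (index m2)
    P = np.zeros((m2 + 1, m2 + 1))
    P[:m2, :m2] = P_grid

    # Redirect each cell's self-transition probability (hitting a wall)
    # to the crashed state, and make the crashed state absorbing.
    for i in range(m2):
        P[i, m2] = P[i, i]
        P[i, i] = 0
    P[m2, m2] = 1

    assert np.allclose(P.sum(axis=1), 1)   # rows still sum to 1

The design point is that the crashed state is absorbing: once entered, the process stays there with probability 1, which is how the "not recoverable" crash is encoded in the transition matrix.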