Reader small image

You're reading from  Deep Reinforcement Learning Hands-On. - Second Edition

Product typeBook
Published inJan 2020
Reading LevelIntermediate
PublisherPackt
ISBN-139781838826994
Edition2nd Edition
Languages
Right arrow
Author (1)
Maxim Lapan
Maxim Lapan
author image
Maxim Lapan

Maxim has been working as a software developer for more than 20 years and was involved in various areas: distributed scientific computing, distributed systems and big data processing. Since 2014 he is actively using machine and deep learning to solve practical industrial tasks, such as NLP problems, RL for web crawling and web pages analysis. He has been living in Germany with his family.
Read more about Maxim Lapan

Right arrow

The model application

Okay, imagine that we have trained the model using the process just described. How should we use it to solve the scrambled cube? From the network's structure, you might imagine the obvious, but not very successful, way:

  1. Feed the model the current state of the cube that we want to solve
  2. From the policy head, get the largest action to perform (or sample it from the resulting distribution)
  3. Apply the action to the cube
  4. Repeat the process until the solved state has been reached

On paper, this method should work, but in practice, it has one serious issue: it doesn't! The main reason for that is our model's quality. Due to the size of the state space and the nature of the NNs, it just isn't possible to train an NN to return the exact optimal action for any input state all of the time. Rather than telling us what to do to get the solved state, our model shows us promising directions to explore. Those directions could bring...

lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Deep Reinforcement Learning Hands-On. - Second Edition
Published in: Jan 2020Publisher: PacktISBN-13: 9781838826994

Author (1)

author image
Maxim Lapan

Maxim has been working as a software developer for more than 20 years and was involved in various areas: distributed scientific computing, distributed systems and big data processing. Since 2014 he is actively using machine and deep learning to solve practical industrial tasks, such as NLP problems, RL for web crawling and web pages analysis. He has been living in Germany with his family.
Read more about Maxim Lapan