
You're reading from Deep Reinforcement Learning Hands-On - Second Edition

Product type: Book
Published in: Jan 2020
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781838826994
Edition: 2nd Edition
Author (1)
Maxim Lapan

Maxim has been working as a software developer for more than 20 years and has been involved in various areas: distributed scientific computing, distributed systems, and big data processing. Since 2014 he has been actively using machine learning and deep learning to solve practical industrial tasks, such as NLP problems, RL for web crawling, and web page analysis. He lives in Germany with his family.

Atari experiments

The MountainCar environment is a nice and fast way to experiment with exploration methods, but to conclude the chapter, I've included Atari versions of the DQN and PPO methods with the exploration tweaks we described. As the primary environment, I've used Seaquest, a game in which a submarine needs to shoot fish and enemy submarines and save aquanauts. This game is not as famous as Montezuma's Revenge, but it can still be considered a medium-hard exploration problem, because to continue the game, you need to manage the oxygen level. When oxygen runs low, the submarine needs to rise to the surface for some time. Without this, the episode ends after 560 steps with a maximum reward of 20. But once the agent learns how to replenish the oxygen, the game can continue almost indefinitely and bring the agent a score of 10k-100k. Surprisingly, traditional exploration methods struggle to discover this; normally, training gets stuck at 560...
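
One way to notice this plateau during training is to track recent episode lengths and rewards against the two limits quoted above. The following is a minimal sketch, not code from the book: the class name `OxygenPlateauMonitor`, the window size, and the tolerance are hypothetical choices for illustration, and only the 560-step and reward-20 defaults come from the text. Note that an untrained agent is also flagged, since it has not yet learned to surface either.

```python
from collections import deque


class OxygenPlateauMonitor:
    """Flags runs in which the agent has not learned to replenish oxygen.

    step_limit and reward_cap default to the 560 steps / reward 20 quoted
    in the text; window and tolerance are illustrative assumptions.
    """

    def __init__(self, step_limit=560, reward_cap=20.0, window=100, tolerance=0.95):
        self.step_limit = step_limit
        self.reward_cap = reward_cap
        self.tolerance = tolerance
        # Only the most recent `window` episodes are kept.
        self.episodes = deque(maxlen=window)

    def record(self, steps, reward):
        """Store the length and total reward of one finished episode."""
        self.episodes.append((steps, reward))

    def is_stuck(self):
        """True when the window is full and almost no episode beats the limits."""
        if len(self.episodes) < self.episodes.maxlen:
            return False
        capped = sum(
            1 for steps, reward in self.episodes
            if steps <= self.step_limit and reward <= self.reward_cap
        )
        return capped / len(self.episodes) >= self.tolerance
```

In a training loop, `record()` would be called once per finished episode, and `is_stuck()` could be logged alongside the mean reward to show whether a given exploration tweak has broken through the oxygen limit.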
