In this chapter, we studied AlphaGo, one of the most prominent reinforcement learning systems to date. We saw why Go was chosen as the challenge and why it is far more complex than chess. We also learned how the Deep Blue chess system works and why Go requires a different architecture and training process. Finally, we studied the architectures and training processes used in AlphaGo and AlphaGo Zero, the differences between the versions, and how AlphaGo Zero surpassed its predecessors.
In the next chapter, we will study how reinforcement learning can be applied to and implemented in self-driving cars.