As we saw in Chapter 4, Q-Learning and SARSA Applications, the Q-learning algorithm has several qualities that make it applicable in many real-world contexts. A key ingredient is its use of the Bellman equation to learn the Q-function: each Q-value is updated from the values of the subsequent state-action pairs. This lets the algorithm learn at every step, without waiting for the trajectory to complete. In addition, every state or state-action pair has its own value, which is stored in and retrieved from a lookup table. Designed in this way, Q-learning converges to the optimal values as long as all the state-action pairs are repeatedly sampled. Furthermore, the method uses two policies: a non-greedy behavior policy to gather experience...
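The properties described above can be illustrated with a minimal sketch of tabular Q-learning. It is not taken from the book: the function names (`make_q_table`, `epsilon_greedy`, `q_learning_update`), the string state labels, and the default values of `alpha` and `gamma` are assumptions chosen for illustration, and epsilon-greedy is assumed as one common choice of non-greedy behavior policy.

```python
import random
from collections import defaultdict


def make_q_table(n_actions):
    # Lookup table: one list of action values per state, created on first access.
    return defaultdict(lambda: [0.0] * n_actions)


def epsilon_greedy(q_table, state, n_actions, epsilon, rng=random):
    # Behavior policy (assumed epsilon-greedy): explore with probability epsilon,
    # otherwise pick the action with the highest stored Q-value.
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = q_table[state]
    return max(range(n_actions), key=lambda a: values[a])


def q_learning_update(q_table, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    # Bellman-style target: reward plus the discounted value of the best
    # action in the next state (zero if the episode has ended).
    best_next = 0.0 if done else max(q_table[next_state])
    td_target = reward + gamma * best_next
    # Move the stored value a step of size alpha toward the target.
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]
```

Because the update only needs a single transition `(state, action, reward, next_state)`, it can be applied after every step of interaction rather than at the end of a trajectory. The `max` over the next state's values is what separates the greedy policy being learned from the exploratory policy that collects the data.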