Index
Symbols
3×3 cube model
A
A2C
using, on Pong 318, 319, 320, 321, 322, 323, 324
using, on Pong results 324, 325, 326, 327
with data parallelism 334
with gradients parallelism 334
A2C method
about 505
models, used for video recording 512
A3C, with data parallelism
about 336
implementation 336, 338, 339, 340, 341, 342, 343, 344
result 344
A3C, with gradients parallelism
implementation 347, 348, 349, 350, 351, 352
results 352
ACKTR
about 616
implementation 617
actions 10
action selectors, cases
argmax 166
policy-based 166
action space 22
actor-critic method 638
advantage 316
Adam algorithm 322
advantage actor-critic (A2C...