Jethro's Braindump

Distributed Reinforcement Learning

Parallelizing Reinforcement Learning ⭐.

History of Distributed RL

  1. DQN (Mnih et al. 2013): Playing Atari with Deep RL. A single learner with an experience replay buffer; the starting point that later systems parallelize.
  2. GORILA (Nair et al., n.d.): massively parallel DQN, with distributed actors and learners coordinated through a central parameter server.
  3. A3C (Mnih et al., n.d.): asynchronous advantage actor-critic; many CPU workers apply gradients asynchronously to shared parameters, removing the need for a replay buffer.
  4. IMPALA (Espeholt et al., n.d.): decoupled actors stream trajectories to a central learner; the resulting off-policy lag is corrected with the V-trace importance-sampling estimator.
  5. Ape-X (Horgan et al., n.d.): distributed prioritized experience replay; many actors feed a shared prioritized replay buffer consumed by a single learner.
  6. R2D3 (Paine et al., n.d.): recurrent replay distributed RL from demonstrations, mixing a demonstration buffer with agent experience to solve hard-exploration problems.
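The actor-learner split shared by GORILA, Ape-X, and IMPALA can be sketched in a few lines. This is a toy illustration, not any paper's actual implementation: the environment, the `ReplayBuffer` class, and the one-parameter "policy" are all invented here, priorities and gradient updates are replaced by stand-ins, and threads stand in for distributed workers.

```python
import random
import threading
from collections import deque

class ReplayBuffer:
    """Central replay shared by all actors (Ape-X also stores
    per-transition priorities; omitted here for brevity)."""
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)
        self.lock = threading.Lock()

    def add(self, transition):
        with self.lock:
            self.buffer.append(transition)

    def sample(self, batch_size):
        with self.lock:
            return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

def actor(actor_id, replay, params, n_steps=100):
    """Each actor runs its own copy of a toy environment with the
    latest shared parameters and pushes transitions to replay."""
    for _ in range(n_steps):
        state = random.random()
        action = 0 if state < params["threshold"] else 1  # behaviour policy
        reward = 1.0 if action == 1 else 0.0
        replay.add((actor_id, state, action, reward))

def learner(replay, params, n_updates=50, batch_size=32):
    """The learner samples batches and updates the shared parameters;
    a moving average stands in for a real gradient step."""
    for _ in range(n_updates):
        batch = replay.sample(batch_size)
        if batch:
            mean_reward = sum(t[3] for t in batch) / len(batch)
            params["threshold"] = 0.9 * params["threshold"] + 0.1 * mean_reward

replay = ReplayBuffer()
params = {"threshold": 0.5}
actors = [threading.Thread(target=actor, args=(i, replay, params)) for i in range(4)]
for t in actors:
    t.start()
for t in actors:
    t.join()
learner(replay, params)
```

In the real systems the actors and learner run concurrently on separate machines, and actors periodically pull fresh parameters mid-episode; running the learner after the actors here just keeps the sketch deterministic.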

Resources

Bibliography

Espeholt, Lasse, Hubert Soyer, Remi Munos, Karen Simonyan, Volodymir Mnih, Tom Ward, Yotam Doron, et al. n.d. “IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures.” http://arxiv.org/abs/1802.01561v3.

Horgan, Dan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, and David Silver. n.d. “Distributed Prioritized Experience Replay.” http://arxiv.org/abs/1803.00933v1.

Mnih, Volodymyr, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. n.d. “Asynchronous Methods for Deep Reinforcement Learning.” http://arxiv.org/abs/1602.01783v2.

Nair, Arun, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Vedavyas Panneershelvam, et al. n.d. “Massively Parallel Methods for Deep Reinforcement Learning.” http://arxiv.org/abs/1507.04296v2.

Paine, Tom Le, Caglar Gulcehre, Bobak Shahriari, Misha Denil, Matt Hoffman, Hubert Soyer, Richard Tanburn, et al. n.d. “Making Efficient Use of Demonstrations to Solve Hard Exploration Problems.” http://arxiv.org/abs/1909.01387v1.

Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. “Playing Atari with Deep Reinforcement Learning.” http://arxiv.org/abs/1312.5602.