ArXiv TLDR

Asynchronous Methods for Deep Reinforcement Learning

arXiv: 1602.01783

Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap + 3 more

cs.LG

TLDR

This paper introduces asynchronous variants of four deep reinforcement learning algorithms in which parallel actor-learners stabilize training; the best of them, an asynchronous actor-critic, surpasses the prior Atari state of the art while training on a single multi-core CPU instead of a GPU.

Key contributions

  • Proposes a simple, lightweight asynchronous framework for optimizing deep neural network controllers via gradient descent.
  • Develops asynchronous variants of four standard RL algorithms (one-step Q-learning, one-step Sarsa, n-step Q-learning, and advantage actor-critic), showing that parallel actor-learners stabilize training.
  • Achieves state-of-the-art performance on Atari games, and succeeds on continuous motor control and 3D visual navigation tasks, using only a multi-core CPU.
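The core mechanism behind these contributions is asynchronous gradient descent: several actor-learners update a shared set of parameters in parallel, each applying its own gradients without global synchronization. A minimal sketch of that update pattern is below, using a toy quadratic loss in place of an RL objective; the names `shared_theta`, `worker`, and the hyperparameters are illustrative and not from the paper.

```python
# Sketch of asynchronous gradient descent with parallel workers, in the
# spirit of the paper's actor-learner setup. Each thread reads the shared
# parameters, computes a gradient, and applies its update lock-free
# (Hogwild-style); here the "loss" is a toy quadratic, not an RL return.
import threading

TARGET = [3.0, -1.0]       # optimum of the toy loss f(theta) = 0.5 * ||theta - TARGET||^2
shared_theta = [0.0, 0.0]  # shared parameters, updated asynchronously by all workers
LR = 0.01                  # learning rate

def grad(theta):
    # Gradient of the toy quadratic loss with respect to theta.
    return [t - g for t, g in zip(theta, TARGET)]

def worker(steps):
    for _ in range(steps):
        g = grad(shared_theta)                # read shared parameters (no lock)
        for i in range(len(shared_theta)):
            shared_theta[i] -= LR * g[i]      # apply this worker's update asynchronously

# Four parallel actor-learners, analogous to the paper's CPU threads.
threads = [threading.Thread(target=worker, args=(2000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_theta)  # converges close to TARGET despite racy updates
```

Occasional lost updates from the unsynchronized writes do not prevent convergence here, since every gradient step still points toward the optimum; the paper similarly relies on parallel, loosely coordinated updates rather than a central locked learner.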

Why it matters

This work matters because it improves both the efficiency and the stability of training deep reinforcement learning agents through asynchronous parallelism. By matching or surpassing GPU-based methods on a single multi-core CPU, it reduces reliance on expensive hardware and broadens applicability to complex control and navigation problems.

Original Abstract

We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers. We present asynchronous variants of four standard reinforcement learning algorithms and show that parallel actor-learners have a stabilizing effect on training allowing all four methods to successfully train neural network controllers. The best performing method, an asynchronous variant of actor-critic, surpasses the current state-of-the-art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
