Sample Efficient Actor-Critic with Experience Replay

Abstract: This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems. To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.

Submission history

From: Ziyu Wang
[v1] Thu, 3 Nov 2016 23:21:32 UTC (1,409 KB)
[v2] Mon, 10 Jul 2017 14:38:10 UTC (2,708 KB)