Two-Level Actor-Critic Using Multiple Teachers

Su Zhang, Srijita Das, Sriram Ganapathi Subramanian, and Matthew E. Taylor.

Abstract

Deep reinforcement learning has successfully allowed agents to learn complex behaviors for many tasks. However, a key limitation of current learning approaches is sample inefficiency, which limits the performance of the learning agent. This paper considers how agents can improve their learning by using teachers’ advice. In particular, we consider the setting with multiple sub-optimal teachers, as opposed to a single near-optimal teacher. We propose a flexible two-level actor-critic algorithm in which the high-level network learns to choose the best teacher for the current situation while the low-level network learns the control policy.
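The two-level idea in the abstract can be illustrated with a small toy sketch. This is not the paper's algorithm: it swaps the deep networks for tabular softmax preferences, and the class name `TwoLevelAgent`, the extra "own policy" advice source, the shared TD-error critic, and all hyperparameters are assumptions made only for illustration.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


class TwoLevelAgent:
    """Hypothetical tabular stand-in for a two-level actor-critic.

    High level: per-state preferences over advice sources
    (one per teacher, plus the agent's own low-level policy).
    Low level: per-state preferences over actions.
    """

    def __init__(self, n_states, n_actions, teachers, lr=0.1, seed=0):
        self.teachers = teachers  # assumed: callables mapping state -> action
        n_sources = len(teachers) + 1  # last index means "use own policy"
        self.high = [[0.0] * n_sources for _ in range(n_states)]
        self.low = [[0.0] * n_actions for _ in range(n_states)]
        self.value = [0.0] * n_states  # shared critic (an assumption)
        self.lr = lr
        self.rng = random.Random(seed)

    def _sample(self, probs):
        return self.rng.choices(range(len(probs)), weights=probs, k=1)[0]

    def act(self, state):
        """Pick an advice source, then the action that source proposes."""
        source = self._sample(softmax(self.high[state]))
        if source < len(self.teachers):
            action = self.teachers[source](state)
        else:
            action = self._sample(softmax(self.low[state]))
        return source, action

    def update(self, state, source, action, reward, next_state, done, gamma=0.99):
        """One-step TD update of the critic and both preference tables."""
        target = reward + (0.0 if done else gamma * self.value[next_state])
        td_error = target - self.value[state]
        self.value[state] += self.lr * td_error
        # Simplification: the low level is also updated on teacher-chosen
        # actions, ignoring any off-policy correction.
        for prefs, chosen in ((self.high[state], source), (self.low[state], action)):
            probs = softmax(prefs)
            for i in range(len(prefs)):
                grad = (1.0 if i == chosen else 0.0) - probs[i]
                prefs[i] += self.lr * td_error * grad
```

As a usage illustration, in a one-state task where action 1 is rewarded, giving the agent one teacher that always advises action 0 and another that always advises action 1 should shift the high-level preferences toward the second teacher, while the low-level preferences move toward action 1.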