Today, the Intel® AI Lab is excited to announce the latest Reinforcement Learning Coach release, packed with new features and goodies for the reinforcement learning research community.
Reinforcement Learning Coach is an open source research framework for training and evaluating reinforcement learning agents by harnessing the power of multi-core CPU processing to achieve state-of-the-art results. Coach contains multi-threaded implementations for more than 20 of today’s leading reinforcement learning algorithms, combined with various games and robotics environments. It enables efficient training of reinforcement learning agents on a desktop computer, without requiring any additional hardware.
One of the main challenges when building a research project, or a solution based on a published algorithm, is getting a concrete and reliable baseline that reproduces the algorithm’s results as reported by its authors. To address this problem, we are releasing a set of benchmarks showing that Reinforcement Learning Coach reliably reproduces the reported results of many state-of-the-art algorithms.
Many reinforcement learning sub-domains use more than a single agent to solve a given task. Complicated tasks might require a combination of several agent skills, self-play, or even a hierarchy of agents conveying goals to one another.
One of the promising reinforcement learning research areas is Hierarchical Reinforcement Learning (HRL). Papers such as “Hierarchical Actor-Critic” and “Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation” use agents operating in different time scales or goal complexity to achieve the desired high-level goals.
We have now added multi-agent support to Reinforcement Learning Coach, allowing several agents to be trained together. This is first exemplified by the addition of generic HRL support to Reinforcement Learning Coach, and specifically by an implementation of the Hierarchical Actor-Critic algorithm.
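To make the goal-conveying pattern concrete, here is a minimal plain-Python sketch of a two-level hierarchy: a high-level agent emits a subgoal every few steps, and a low-level agent acts toward that subgoal. The class names and the toy 1-D environment are purely illustrative assumptions, not Coach’s actual API.

```python
class LowLevelAgent:
    """Acts greedily toward a subgoal on a 1-D line (stand-in for a learned policy)."""
    def act(self, state, subgoal):
        if subgoal > state:
            return +1
        if subgoal < state:
            return -1
        return 0

class HighLevelAgent:
    """Emits a new subgoal every `horizon` low-level steps."""
    def __init__(self, final_goal, horizon=4):
        self.final_goal = final_goal
        self.horizon = horizon

    def propose_subgoal(self, state):
        # Move part of the way toward the final goal (stand-in for a learned policy).
        step = max(-self.horizon, min(self.horizon, self.final_goal - state))
        return state + step

def run_hierarchy(start=0, goal=10, horizon=4, max_steps=50):
    """Run the two-level hierarchy until the final goal is reached."""
    high, low = HighLevelAgent(goal, horizon), LowLevelAgent()
    state, steps = start, 0
    while state != goal and steps < max_steps:
        subgoal = high.propose_subgoal(state)   # high level operates on a coarser time scale
        for _ in range(horizon):
            state += low.act(state, subgoal)    # low level acts on every time step
            steps += 1
            if state == subgoal:
                break
    return state
```

In Coach, each level in such a hierarchy is itself a full reinforcement learning agent; the sketch only shows how the time scales and goal hand-offs nest.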
New agent memory types were added, allowing the implementation and reproduction of even more algorithms. Specifically, Reinforcement Learning Coach now also supports Hindsight Experience Replay, fully reproducing the paper results, and Prioritized Experience Replay. These memory types can now also be combined with any of the 21 supported reinforcement learning algorithms to create new ones.
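The core idea behind Hindsight Experience Replay is simple enough to sketch in a few lines: transitions from a failed episode are stored again with their goal replaced by a state the agent actually reached later, turning failures into useful positive examples. The function below is an illustrative sketch of the “future” relabeling strategy, not Coach’s implementation; the transition layout is an assumption for this example.

```python
import random

def her_relabel(episode, k=4):
    """Hindsight relabeling ('future' strategy).

    `episode` is a list of (state, action, next_state, goal) tuples. For each
    transition we store the original (with a sparse goal-reached reward) plus up
    to `k` copies whose goal is a state actually achieved later in the episode.
    """
    relabeled = []
    for t, (s, a, s_next, goal) in enumerate(episode):
        # Original transition: reward 1 only if the intended goal was reached.
        relabeled.append((s, a, s_next, goal, 1.0 if s_next == goal else 0.0))
        # Hindsight copies: pretend a future achieved state was the goal all along.
        future_states = [step[2] for step in episode[t:]]
        for new_goal in random.sample(future_states, min(k, len(future_states))):
            relabeled.append((s, a, s_next, new_goal, 1.0 if s_next == new_goal else 0.0))
    return relabeled
```

Because the relabeled transitions are ordinary (state, action, next state, goal, reward) tuples, the same memory can feed any off-policy algorithm, which is what makes mixing these memory types with existing agents in Coach possible.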
We have also added generic support for shared memory between workers, allowing agents to learn from much more diverse experience; this, too, can be used with any of the algorithms in Reinforcement Learning Coach.
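The mechanics of a memory shared across worker threads can be sketched in a few lines: a single lock-guarded buffer that every worker writes its transitions into, so the pooled experience is more diverse than any single worker’s. This is a minimal illustrative sketch, not Coach’s internal data structure.

```python
import threading
from collections import deque

class SharedReplayMemory:
    """A replay buffer shared by all worker threads, guarded by a lock."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)
        self.lock = threading.Lock()

    def add(self, transition):
        with self.lock:
            self.buffer.append(transition)

    def __len__(self):
        with self.lock:
            return len(self.buffer)

def run_workers(memory, n_workers=4, n_transitions=25):
    """Spawn worker threads that all feed the same shared memory."""
    def worker(worker_id):
        for step in range(n_transitions):
            # Stand-in for a real (state, action, reward, next_state) tuple.
            memory.add((worker_id, step))
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

A learner sampling from this buffer sees transitions interleaved from all workers, which is the diversification benefit the release notes refer to.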
We have integrated Reinforcement Learning Coach with Blizzard*’s StarCraft II using DeepMind*’s pysc2 python wrapper. In addition, we are releasing two agent presets – A3C and Dueling DDQN – successfully trained on this newly integrated environment to solve the mineral collection challenge.
Additionally, we have integrated DeepMind’s Control Suite and a few environments that implement toy problems used in the latest reinforcement learning research.
In this release, Reinforcement Learning Coach has gone through a major refactoring to improve both programmability and code readability. We aim for simplicity in the form of code that is easy to read and easy to understand, and we hope this will allow even more researchers and engineers to break into the field and innovate.
Several tutorials were added to help new users get started with adding a new environment and with the development of new algorithms using Reinforcement Learning Coach. Defining an HRL graph with Reinforcement Learning Coach is also quite straightforward using our latest release.
Finally, Reinforcement Learning Coach is now also available as a Python package through pip, making installation even easier than before.
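Installation then comes down to a single command; the preset name below is illustrative, and the exact launcher invocation may differ, so check the docs for the current usage.

```shell
# Install Reinforcement Learning Coach from PyPI (Python 3)
pip3 install rl_coach

# Launch training with one of the bundled presets, e.g. DQN on CartPole
coach -p CartPole_DQN
```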
By simplifying programmability and adding new capabilities and algorithms, we hope to pave the way to new and innovative reinforcement learning algorithms that can help push the field forward. Get started by reviewing the docs and working through the tutorials.
Overall, we believe the new additions open up a large number of opportunities to innovate.