AAMAS 2019 MARL-PPS

MARL-PPS: Multi-agent Reinforcement Learning with Periodic Parameter Sharing

Safa Cicek, Alireza Nakhaei, Stefano Soatto, Kikuo Fujimura

International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2019

We present a multi-agent reinforcement learning algorithm that is a simple yet effective modification of a known algorithm. External agents are modeled as a time-varying environment whose policy parameters are updated periodically, at a slower rate than the planner's, to make learning more stable and more efficient. The replay buffer, which stores experiences, is reset with the same large period so that samples are always drawn from a fixed environment. This enables us to address challenging cooperative control problems in highway navigation. The resulting algorithm, Multi-agent Reinforcement Learning with Periodic Parameter Sharing (MARL-PPS), outperforms the baselines in the multi-agent highway scenarios we tested.
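The update schedule described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the class name `PeriodicSharingTrainer`, the `share_period` parameter, the toy parameter update, and the choice to refresh the external agents by copying the planner's current parameters are all assumptions made for the example. The abstract states only that external agents' parameters are updated periodically at a slower rate and that the replay buffer is reset with the same period.

```python
import random
from collections import deque


class PeriodicSharingTrainer:
    """Toy skeleton: one learner (the planner) plus a frozen parameter copy
    used by the external agents. All names here are hypothetical."""

    def __init__(self, init_params, share_period, buffer_size=10_000):
        self.planner_params = list(init_params)   # updated every step
        self.external_params = list(init_params)  # updated every share_period steps
        self.share_period = share_period
        self.buffer = deque(maxlen=buffer_size)
        self.step_count = 0
        self.sync_count = 0

    def update_planner(self, batch, lr=0.1):
        # Placeholder for a real RL update (assumption): nudge every
        # parameter toward the mean reward of the sampled batch.
        target = sum(t["reward"] for t in batch) / len(batch)
        self.planner_params = [p + lr * (target - p) for p in self.planner_params]

    def step(self, transition, batch_size=4):
        # Store the new experience and update the planner on a sampled batch.
        self.buffer.append(transition)
        k = min(batch_size, len(self.buffer))
        batch = random.sample(list(self.buffer), k)
        self.update_planner(batch)
        self.step_count += 1

        # Every share_period steps: refresh the external agents' parameters
        # and reset the buffer, so stored experience always comes from the
        # currently fixed set of external policies.
        if self.step_count % self.share_period == 0:
            self.external_params = list(self.planner_params)
            self.buffer.clear()
            self.sync_count += 1


trainer = PeriodicSharingTrainer([0.0, 0.0], share_period=5)
for _ in range(12):
    trainer.step({"reward": 1.0})
```

In this sketch, between two synchronization points the external agents act with fixed parameters, so from the planner's point of view the environment is stationary, and clearing the buffer at each synchronization discards experience gathered against the outdated external policies.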
