NeurIPS2024 LangMarl - Honda Research Institute USA


Language Grounded Multi-agent Reinforcement Learning with Human-interpretable Communication

Huao Li, Hossein Nourkhiz Mahjoub, Behdad Chalaki, Vaishnav Tadiparthi, Kwonjoon Lee, Ehsan Moradi-Pari, Charles Michael Lewis, Katia P. Sycara

NeurIPS 2024

Multi-Agent Reinforcement Learning (MARL) methods have shown promise in enabling agents to learn a shared communication protocol from scratch and accomplish challenging team tasks. However, the learned language is usually not interpretable to humans or other agents not co-trained together, limiting its applicability in ad-hoc teamwork scenarios. In this work, we propose a novel computational pipeline that aligns the communication space between MARL agents with an embedding space of human natural language by grounding agent communications on synthetic data generated by embodied Large Language Models (LLMs) in interactive teamwork scenarios. Our results demonstrate that introducing language grounding not only maintains task performance but also accelerates the emergence of communication. Furthermore, the learned communication protocols exhibit zero-shot generalization capabilities in ad-hoc teamwork scenarios with unseen teammates and novel task states. This work presents a significant step toward enabling effective communication and collaboration between artificial agents and humans in real-world teamwork settings.
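
To make the idea of aligning an agent's communication space with a language embedding space more concrete, here is a minimal sketch. It is not taken from the paper: it assumes the grounding signal can be written as a cosine-distance penalty between an agent's message vector and the embedding of a reference utterance, added to the task loss with a weighting factor. The names `grounding_loss`, `total_loss`, `nearest_phrase`, the `weight` parameter, and the toy phrase vocabulary are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def grounding_loss(message, reference):
    """Assumed alignment penalty: 0 when the message points the same way as the reference."""
    return 1.0 - cosine_similarity(message, reference)


def total_loss(task_loss, message, reference, weight=0.5):
    """Assumed combined objective: task loss plus a weighted grounding penalty."""
    return task_loss + weight * grounding_loss(message, reference)


def nearest_phrase(message, phrase_embeddings):
    """Read a message back as the phrase whose embedding is most similar to it."""
    return max(
        phrase_embeddings,
        key=lambda phrase: cosine_similarity(message, phrase_embeddings[phrase]),
    )


# Toy, hand-made embeddings standing in for real language embeddings (hypothetical).
phrases = {
    "target found": [1.0, 0.0, 0.0],
    "need help": [0.0, 1.0, 0.0],
    "area clear": [0.0, 0.0, 1.0],
}
agent_message = [0.9, 0.1, 0.0]

decoded = nearest_phrase(agent_message, phrases)
loss = total_loss(2.0, agent_message, phrases["target found"], weight=0.5)
```

Under these assumptions, a message that lies close to a phrase embedding incurs almost no extra penalty and can be read back by a human or an unseen teammate through a nearest-neighbor lookup, which is one plausible way to obtain the interpretability described above.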
