Abstract:
For multi-agent cooperation tasks in partially observable scenarios, traditional reinforcement learning algorithms struggle to achieve satisfactory performance, whereas multi-agent reinforcement learning (MARL) algorithms with communication can effectively improve collaboration. We design a MARL algorithm based on a multi-head attention mechanism: agents aggregate messages using self-attention and multi-head attention to achieve efficient communication. We adopt the centralized training with decentralized execution (CTDE) framework to evaluate and improve policies from a global perspective, thereby improving the quality of decision-making. Experimental results on Traffic-Junction show that our method effectively improves multi-agent cooperation and achieves better performance than existing algorithms.
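The message-aggregation step the abstract describes can be illustrated with a minimal sketch: each agent attends over all agents' messages with scaled dot-product self-attention, split across multiple heads. This is a generic multi-head attention implementation in NumPy, not the paper's actual architecture; the random projection matrices stand in for learned weights, and the dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(messages, num_heads=2, seed=0):
    """Aggregate agent messages with multi-head self-attention.

    messages: (n_agents, d_model) array, d_model divisible by num_heads.
    Returns an (n_agents, d_model) array of aggregated messages.
    """
    n, d = messages.shape
    d_head = d // num_heads
    # Random projections stand in for learned Q/K/V weight matrices.
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    Q, K, V = messages @ Wq, messages @ Wk, messages @ Wv
    heads = []
    for h in range(num_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        # Scaled dot-product attention within this head's slice.
        scores = Q[:, s] @ K[:, s].T / np.sqrt(d_head)
        heads.append(softmax(scores) @ V[:, s])
    # Concatenate the per-head outputs back to d_model dimensions.
    return np.concatenate(heads, axis=1)

msgs = np.ones((3, 4))  # 3 agents, 4-dimensional messages
agg = multi_head_attention(msgs)
print(agg.shape)  # (3, 4)
```

Each agent's output row is a weighted combination of all agents' value vectors, so every agent can condition its policy on information beyond its own partial observation, which is the communication benefit the abstract claims.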