Multi-Agent Reinforcement Learning Algorithm Based on Multi-Head Attention Communication

  • Abstract: In cooperative multi-agent tasks where each agent has a limited field of view (partial observability), traditional reinforcement learning algorithms struggle to achieve satisfactory performance, whereas multi-agent reinforcement learning (MARL) algorithms that use message communication can effectively improve collaboration. We therefore propose a MARL algorithm based on multi-head attention communication. Agents aggregate messages using self-attention combined with multi-head attention, which enables efficient information exchange. The algorithm adopts the centralized training with decentralized execution (CTDE) framework, evaluating policy value from a global perspective to improve decision quality. Experimental results on the classic simulation environment Traffic-Junction show that the method effectively improves multi-agent cooperation and outperforms existing algorithms.
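The abstract does not give the network details, so the following is only a minimal sketch of how messages from several agents could be aggregated with scaled dot-product multi-head attention. It is a generic illustration, not the authors' implementation. The function name `aggregate_messages`, the projection matrices `w_q`, `w_k`, `w_v`, `w_o`, the dimensions, and the random weights (standing in for learned parameters) are all assumptions made for this example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_messages(messages, w_q, w_k, w_v, w_o, num_heads):
    """Aggregate agent messages with multi-head self-attention.

    messages: array of shape (n_agents, d_model), one message per agent.
    Returns the aggregated messages (n_agents, d_model) and the
    attention weights (num_heads, n_agents, n_agents).
    """
    n_agents, d_model = messages.shape
    d_head = d_model // num_heads

    def split_heads(x):
        # (n_agents, d_model) -> (num_heads, n_agents, d_head)
        return x.reshape(n_agents, num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(messages @ w_q)
    k = split_heads(messages @ w_k)
    v = split_heads(messages @ w_v)

    # Each agent scores every agent's message, separately per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)

    # Weighted sum of values, then concatenate heads and project.
    heads = weights @ v
    concat = heads.transpose(1, 0, 2).reshape(n_agents, d_model)
    return concat @ w_o, weights

# Hypothetical sizes: 4 agents, 8-dimensional messages, 2 heads.
rng = np.random.default_rng(0)
n_agents, d_model, num_heads = 4, 8, 2
messages = rng.normal(size=(n_agents, d_model))
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))

aggregated, weights = aggregate_messages(messages, w_q, w_k, w_v, w_o, num_heads)
print(aggregated.shape)  # (4, 8)
print(weights.shape)     # (2, 4, 4)
```

In a CTDE setup, each agent would use its own row of `aggregated` as input to its policy at execution time, while a centralized critic would be used only during training. That wiring is likewise an assumption here, not something the abstract specifies.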

     
