PATH PLANNING METHOD BASED ON DEEP REINFORCEMENT LEARNING WITH AN IMPROVED REWARD FUNCTION

Abstract: To address the sparse reward problem that deep reinforcement learning faces in path planning, a deep reinforcement learning model based on a potential-based reward function is proposed. By designing a new reward function, the model increases reward density and sample utilization, reduces training difficulty, and improves the agent's pathfinding success rate across different maps. Simulation results show that the improved model raises the path planning success rate by 7.08 percentage points on simple maps and by 12.60 percentage points on complex maps. Compared with state-of-the-art algorithms, it achieves a similar pathfinding success rate but produces shorter planned paths.
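
To illustrate how a potential-based reward can densify a sparse signal, the following minimal sketch adds the standard shaping term γΦ(s') − Φ(s) to a sparse environment reward. The abstract does not specify the potential function, so the choice here (negative Euclidean distance to the goal), the discount factor, and the names `potential` and `shaped_reward` are all assumptions made for illustration, not the model's actual design.

```python
import math

GAMMA = 0.99  # discount factor (assumed value, not taken from the paper)


def potential(state, goal):
    """Hypothetical potential: negative Euclidean distance to the goal."""
    return -math.dist(state, goal)


def shaped_reward(sparse_reward, state, next_state, goal, gamma=GAMMA):
    """Return the sparse reward plus the shaping term gamma*phi(s') - phi(s)."""
    shaping = gamma * potential(next_state, goal) - potential(state, goal)
    return sparse_reward + shaping


if __name__ == "__main__":
    goal = (3.0, 4.0)
    start = (0.0, 0.0)
    # A step toward the goal yields positive feedback even when the
    # environment's own reward is zero; a step away yields negative feedback.
    print(shaped_reward(0.0, start, (0.0, 1.0), goal))
    print(shaped_reward(0.0, start, (0.0, -1.0), goal))
```

Under these assumptions, every transition carries a nonzero learning signal, which is the sense in which reward density increases; the sparse terminal reward is left unchanged.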