Obstacle Avoidance of Unmanned Surface Vehicles Based on Deep Reinforcement Learning

Abstract: To improve the navigation safety of unmanned surface vehicles (USVs), this paper proposes D_TD3, an intelligent obstacle avoidance algorithm based on an improved twin delayed deep deterministic policy gradient (TD3) algorithm. A hybrid risk assessment model is proposed: a dynamic ship domain is constructed from the USV's motion parameters, maritime collision avoidance rules, and related factors, and the collision risk degree is computed hierarchically. A guiding reward function is designed from the collision risk degree, the collision avoidance rules, and related factors. A prioritized sampling method with a hybrid experience pool is proposed to improve the algorithm's efficiency. Experimental results show that D_TD3 completes the navigation task effectively, with a success rate above 85%, a shorter average navigation time, and significantly improved navigation robustness.
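
For orientation, the baseline TD3 update that D_TD3 builds on can be sketched as follows. This is a minimal illustration of the standard TD3 target (target-policy smoothing plus the clipped double-Q minimum) for a scalar action. It is not the paper's D_TD3 modification, which the abstract does not spell out. The function name `td3_target`, the callables passed in, and all default hyperparameter values are assumptions chosen for illustration.

```python
import random


def td3_target(reward, next_state, done,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0):
    """Standard TD3 bootstrapped target for one transition (scalar action).

    actor_target(state) -> action and critic*_target(state, action) -> Q-value
    are hypothetical callables standing in for the target networks.
    """
    # Target-policy smoothing: add clipped Gaussian noise to the target action.
    noise = random.gauss(0.0, noise_std)
    noise = max(-noise_clip, min(noise_clip, noise))
    next_action = actor_target(next_state) + noise
    # Keep the smoothed action inside the valid action range.
    next_action = max(action_low, min(action_high, next_action))

    # Clipped double-Q: take the smaller of the two target critics' estimates
    # to counter overestimation bias.
    q_min = min(critic1_target(next_state, next_action),
                critic2_target(next_state, next_action))

    # No bootstrapping past terminal states (e.g. collision or goal reached).
    return reward + gamma * (1.0 - float(done)) * q_min
```

In a USV setting, `reward` would come from the risk-aware reward function described in the abstract, and `done` would mark a collision or arrival at the goal; both mappings are assumptions here, not details taken from the paper.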

     
