2025, No. 3, Vol. 46, pp. 23-29
CLC number: TP242.3 Document code: A Article ID: 1674-2605(2025)03-0004-07
DOI: 10.12475/aie.20250304 Open access
Motion Control Method for Small ROV Based on Supervised DDPG Algorithm
HUANG Zhaojun ZHANG Yanjia ZUO Xiaowen CHEN Zexun
(Zhuhai City Polytechnic, Zhuhai 519090, China)
Abstract: To address the issues of prolonged learning time and difficulty in convergence when using the Deep Deterministic Policy Gradient (DDPG) algorithm for motion control of remotely operated tethered underwater vehicles (ROVs), this paper proposes a supervised DDPG-based motion control method for small ROVs. During the initial learning phase of the DDPG algorithm, a supervised learning approach is introduced to accelerate neural network convergence and reduce learning time by leveraging expert experience. Simulation results demonstrate that the supervised DDPG algorithm achieves superior control performance compared to the standard DDPG algorithm.
Keywords: supervised DDPG; small ROV; motion control; expert experience; reinforcement learning
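Illustrative sketch (not the authors' implementation): the abstract does not state how the supervised signal is combined with the DDPG update, so the code below assumes one common arrangement, a blended actor loss whose supervision weight decays to zero over a warm-up period. The scalar linear actor stands in for the neural network, and the proportional-controller expert, the linear decay schedule, and all function and parameter names are hypothetical choices made only for illustration.

```python
import random


def expert_action(error, gain=2.0):
    # Hypothetical expert: a proportional controller acting on the tracking error.
    return -gain * error


def supervision_weight(episode, warmup_episodes=50):
    # Assumed schedule: full supervision at episode 0, decaying linearly to 0.
    return max(0.0, 1.0 - episode / warmup_episodes)


def actor_step(w, error, dq_da, lam, lr=0.05):
    """One gradient step on a scalar linear actor a = w * error.

    Assumed loss: lam * (a - a_expert)^2 - (1 - lam) * Q(s, a), so
    dLoss/dw = [2 * lam * (a - a_expert) - (1 - lam) * dq_da] * error.
    dq_da is the critic's action gradient, supplied by the caller.
    """
    a = w * error
    grad_a = 2.0 * lam * (a - expert_action(error)) - (1.0 - lam) * dq_da
    return w - lr * grad_a * error


def pretrain(w=0.0, steps=500, seed=0):
    # Purely supervised phase (lam = 1): the critic gradient is ignored.
    rng = random.Random(seed)
    for _ in range(steps):
        error = rng.uniform(-1.0, 1.0)
        w = actor_step(w, error, dq_da=0.0, lam=1.0)
    return w


if __name__ == "__main__":
    print("actor gain after supervised warm-up:", pretrain())
    print("supervision weight at episode 25:", supervision_weight(25))
```

With `lam = 1` the step reduces to imitation of the expert, and with `lam = 0` it reduces to the usual deterministic policy gradient ascent on the critic. In this sketch the actor therefore starts from a policy close to the expert's before the reinforcement signal takes over, which is the mechanism the abstract credits for faster convergence.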