Depth control of autonomous underwater vehicle using deep reinforcement learning

doi:10.11959/j.issn.2096-6652.202038

Abstract

This paper studies depth control of an autonomous underwater vehicle (AUV) using deep reinforcement learning. Unlike traditional control algorithms, the deep reinforcement learning method lets the AUV learn the control law on its own, which avoids manually building an accurate model and designing a control law. The deep deterministic policy gradient (DDPG) method is used to design two neural networks: an actor and a critic. The actor network outputs the agent's control actions, and the critic network estimates the action-value function used in reinforcement learning. AUV depth control is achieved by training the actor and critic networks. Simulations on OpenAI Gym demonstrate the effectiveness of the algorithm.
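The actor and critic roles described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the state layout, layer width, actuator limit, discount factor, and all function names are assumptions made for illustration, and the gradient updates themselves are omitted. It shows only the forward passes, the critic's bootstrapped target, and the soft target-network update commonly used with deep deterministic policy gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 4      # assumed: e.g. depth error, pitch angle, and their rates
ACTION_DIM = 1     # assumed: a single stern-plane command
HIDDEN = 16        # assumed hidden-layer width
MAX_ACTION = 0.5   # assumed actuator limit, in radians


def init_mlp(in_dim, out_dim):
    """Return randomly initialised weights for a one-hidden-layer network."""
    return {
        "W1": rng.normal(0.0, 0.1, (in_dim, HIDDEN)),
        "b1": np.zeros(HIDDEN),
        "W2": rng.normal(0.0, 0.1, (HIDDEN, out_dim)),
        "b2": np.zeros(out_dim),
    }


def actor(params, state):
    """Deterministic policy: map a state to a bounded control action."""
    h = np.maximum(0.0, state @ params["W1"] + params["b1"])  # ReLU
    return MAX_ACTION * np.tanh(h @ params["W2"] + params["b2"])


def critic(params, state, action):
    """Action-value estimate Q(s, a), one scalar per sample."""
    x = np.concatenate([state, action], axis=-1)
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])
    return (h @ params["W2"] + params["b2"]).squeeze(-1)


def td_target(reward, done, next_state, target_actor, target_critic, gamma=0.99):
    """Critic regression target: r + gamma * (1 - done) * Q'(s', mu'(s'))."""
    next_action = actor(target_actor, next_state)
    next_q = critic(target_critic, next_state, next_action)
    return reward + gamma * (1.0 - done) * next_q


def soft_update(target, online, tau=0.01):
    """Move each target-network weight a fraction tau toward the online weight."""
    return {k: (1.0 - tau) * target[k] + tau * online[k] for k in target}
```

In a full training loop, the critic would be fitted to `td_target` by minimising the squared error, and the actor would be updated along the gradient of the critic's output with respect to the action. An automatic-differentiation library would normally handle both steps.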

Key words: autonomous underwater vehicle, depth control, deep reinforcement learning

CLC Number: TP242.6

Rizhong WANG, Huiping LI, Di CUI, et al. Depth control of autonomous underwater vehicle using deep reinforcement learning[J]. Chinese Journal of Intelligent Science and Technology, 2020, 2(4): 354-360.

