[1] HU Z, WANG M, YUAN J G. A review of the development of all-electric propulsion platform in the world[J]. Spacecraft Environment Engineering, 2015, 32(5): 566-570.
[2] DONALD B, LYNCH K, RUS D. Algorithmic and computational robotics[M]. [S.l.]: A K Peters/CRC Press, 2001.
[3] KAVRAKI L E, SVESTKA P, LATOMBE J C, et al. Probabilistic roadmaps for path planning in high-dimensional configuration spaces[J]. IEEE Transactions on Robotics and Automation, 1996, 12(4): 566-580.
[4] AMARJYOTI S. Deep reinforcement learning for robotic manipulation-the state of the art[J]. arXiv preprint, 2017, arXiv:1701.08878.
[5] DUO N X, LYU Q, LIN H C, et al. Step into high-dimensional and continuous action space: a survey on applications of deep reinforcement learning to robotics[J]. Robot, 2019, 41(2): 276-288.
[6] GU S X, HOLLY E, LILLICRAP T, et al. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates[C]// Proceedings of 2017 IEEE International Conference on Robotics and Automation. Piscataway: IEEE Press, 2017: 3389-3396.
[7] ZHU Y K, WANG Z Y, MEREL J, et al. Reinforcement and imitation learning for diverse visuomotor skills[J]. arXiv preprint, 2018, arXiv:1802.09564.
[8] SANGIOVANNI B, INCREMONA G P, PIASTRA M, et al. Self-configuring robot path planning with obstacle avoidance via deep reinforcement learning[J]. IEEE Control Systems Letters, 2021, 5(2): 397-402.
[9] SANGIOVANNI B, RENDINIELLO A, INCREMONA G P, et al. Deep reinforcement learning for collision avoidance of robotic manipulators[C]// Proceedings of 2018 European Control Conference. Piscataway: IEEE Press, 2018.
[10] KURUTACH T, CLAVERA I, DUAN Y, et al. Model-ensemble trust-region policy optimization[J]. arXiv preprint, 2018, arXiv:1802.10592.
[11] SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[J]. arXiv preprint, 2017, arXiv:1707.06347.
[12] ZHANG Y Z, CLAVERA I, TSAI B, et al. Asynchronous methods for model-based reinforcement learning[J]. arXiv preprint, 2019, arXiv:1910.12453.
[13] SILVER D, LEVER G, HEESS N, et al. Deterministic policy gradient algorithms[C]// Proceedings of the 31st International Conference on Machine Learning. [S.l.:s.n.], 2014: 387-395.
[14] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[J]. arXiv preprint, 2015, arXiv:1509.02971.
[15] BARTH-MARON G, HOFFMAN M W, BUDDEN D, et al. Distributed distributional deterministic policy gradients[J]. arXiv preprint, 2018, arXiv:1804.08617.
[16] FUJIMOTO S, VAN HOOF H, MEGER D. Addressing function approximation error in actor-critic methods[J]. arXiv preprint, 2018, arXiv:1802.09477.
[17] TASSA Y, DORON Y, MULDAL A, et al. DeepMind control suite[J]. arXiv preprint, 2018, arXiv:1801.00690.
[18] QURESHI A H, SIMEONOV A, BENCY M J, et al. Motion planning networks[C]// Proceedings of 2019 International Conference on Robotics and Automation. Piscataway: IEEE Press, 2019: 2118-2124.
[19] ZHONG J, WANG T, CHENG L L. Collision-free path planning for welding manipulator via hybrid algorithm of deep reinforcement learning and inverse kinematics[J]. Complex & Intelligent Systems, 2021: 1-14.
[20] MIN C H, SONG J B. End-to-end robot manipulation using demonstration-guided goal strategies[C]// Proceedings of 2019 16th International Conference on Ubiquitous Robots. Piscataway: IEEE Press, 2019: 159-164.
[21] XU J, HOU Z M, WANG W, et al. Feedback deep deterministic policy gradient with fuzzy reward for robotic multiple peg-in-hole assembly tasks[J]. IEEE Transactions on Industrial Informatics, 2019: 1658-1667.
[22] ZHU Z Y, HU H S. Robot learning from demonstration in robotic assembly: a survey[J]. Robotics, 2018, 7(2): 17.
[23] LIU W H, CHEN D S, ZHANG L Z. Learning from demonstration based obstacle avoidance algorithm to plan the trajectory of a mobile manipulator[J]. Journal of Harbin Engineering University, 2018, 39(9): 1546-1553.
[24] ABDO N, KRETZSCHMAR H, SPINELLO L, et al. Learning manipulation actions from a few demonstrations[C]// Proceedings of 2013 IEEE International Conference on Robotics and Automation. Piscataway: IEEE Press, 2013: 1268-1275.
[25] HEESS N, TB D, SRIRAM S, et al. Emergence of locomotion behaviours in rich environments[J]. arXiv preprint, 2017, arXiv:1707.02286.