Chinese Journal of Intelligent Science and Technology ›› 2022, Vol. 4 ›› Issue (2): 223-232. doi: 10.11959/j.issn.2096-6652.202225

• Special Topic: Autonomous Agent Learning for Dexterous and Accurate Manipulations

Research on the manipulator intelligent trajectory planning method based on the improved TD3 algorithm

Qiang ZHANG, Wen WEN, Xiaodong ZHOU, Weihui LIU, Xiaoyu CHU   

  1. Beijing Institute of Control Engineering, Beijing 100190, China
  • Online: 2022-06-01 Published: 2022-06-01
  • Supported by:
    The National Key Research and Development Program of China (2018AAA0103004)


An intelligent trajectory planning and obstacle avoidance method based on an improved twin delayed deep deterministic policy gradient (TD3) algorithm is proposed to solve the trajectory planning problem for a 4-DOF manipulator mounted on a satellite. Training proceeds in two stages. In the pre-training stage, guidance toward the target position is always combined with the output of the policy network to optimize the trajectory. After pre-training, the algorithm can autonomously output a velocity trajectory when the initial position and the target are specified randomly in the joint space of the manipulator. This target-guided mechanism reduces unnecessary exploration and improves learning efficiency in the high-dimensional action space. In the second training stage, a collision-free safe reference trajectory is first obtained by demonstration, and this trajectory is then learned continually during training until the final output trajectory is able to avoid obstacles.
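The target-guided idea in the pre-training stage can be illustrated with a small sketch. The abstract does not give the actual blending rule, so everything here is an assumption made for illustration: the linear mix, the proportional guidance term, and the names `guided_action`, `guide_weight`, `gain`, and `max_speed` are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def guided_action(
    policy_action: Sequence[float],
    joint_pos: Sequence[float],
    target_pos: Sequence[float],
    guide_weight: float,
    gain: float = 1.0,
    max_speed: float = 1.0,
) -> List[float]:
    """Blend a policy's joint-velocity command with a term pointing at the target.

    Assumed rule (not from the paper): a convex combination of the policy
    output and a proportional guidance velocity, clipped to +/- max_speed.
    """
    blended = []
    for a, q, q_goal in zip(policy_action, joint_pos, target_pos):
        guide = gain * (q_goal - q)  # proportional pull toward the target joint angle
        v = (1.0 - guide_weight) * a + guide_weight * guide
        blended.append(max(-max_speed, min(max_speed, v)))  # clip to the speed limit
    return blended


def rollout(q0: Sequence[float], target: Sequence[float], steps: int, dt: float) -> List[float]:
    """Integrate joint positions under pure guidance (policy output held at zero)."""
    q = list(q0)
    for _ in range(steps):
        v = guided_action([0.0] * len(q), q, target, guide_weight=1.0)
        q = [qi + vi * dt for qi, vi in zip(q, v)]
    return q


if __name__ == "__main__":
    # Hypothetical 4-DOF example: start at zero, random-looking joint-space target.
    final_q = rollout([0.0, 0.0, 0.0, 0.0], [1.0, -1.0, 0.5, 0.0], steps=200, dt=0.05)
    print(final_q)
```

With `guide_weight=0.0` the function returns the (clipped) policy output unchanged, which corresponds to the situation after pre-training, when the policy outputs the velocity trajectory on its own. In a TD3 training loop, the blended command would be the action sent to the environment during pre-training; how the paper stores or weights such transitions is not stated in the abstract.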

Key words: obstacle avoidance planning, target-guide, trajectory demonstration, double training
