Chinese Journal of Intelligent Science and Technology, 2021, Vol. 3, Issue (4): 449-455. doi: 10.11959/j.issn.2096-6652.202144

• Special column: Data-based learning and optimization

Action dependent heuristic online tracking control for a class of nonaffine systems

Huiling ZHAO1,2,3,4, Ding WANG1,2,3,4, Jin REN1,2,3,4

  1 Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  2 Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China
  3 Beijing Institute of Artificial Intelligence, Beijing 100124, China
  4 Beijing Laboratory of Smart Environmental Protection, Beijing 100124, China
  • Revised: 2021-11-20 Online: 2021-12-15 Published: 2021-12-01
  • About the authors:
    Huiling ZHAO (1998– ), female, master's student at the Faculty of Information Technology, Beijing University of Technology. Her main research interests are reinforcement learning and intelligent control.
    Ding WANG (1984– ), male, Ph.D., professor and doctoral supervisor at the Faculty of Information Technology, Beijing University of Technology. His main research interests are reinforcement learning and intelligent control.
    Jin REN (1999– ), female, master's student at the Faculty of Information Technology, Beijing University of Technology. Her main research interests are reinforcement learning and intelligent control.
  • Supported by:
    The National Natural Science Foundation of China (61773373); the National Natural Science Foundation of China (61890930-5); the National Natural Science Foundation of China (62021003); the National Key Research and Development Project of China (2021ZD0112300-2); the National Key Research and Development Project of China (2018YFC1900800-5); Beijing Natural Science Foundation (JQ19013)

Abstract:

To address the tracking control problem for nonaffine systems, an online design method based on the action dependent heuristic dynamic programming (ADHDP) structure is proposed. First, for a class of unknown nonaffine systems, the tracking control problem is transformed into an error regulation problem. Then, an ADHDP-based tracking controller is designed, and online learning is adopted so that the action network and the critic network are trained at the same time as the system is being controlled, enabling the system state to track the desired trajectory. Finally, a simulation example verifies the effectiveness of the proposed method.
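
The transformation from tracking control to error regulation mentioned in the abstract can be written out roughly as follows. This is a minimal sketch in generic ADHDP-style notation, not the paper's own formulation: the symbols $F$, $x_k$, $u_k$, $r_k$, $e_k$, $U$, $\gamma$, and $Q$ are all assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the paper):
%   x_k : system state          u_k : control input
%   r_k : desired trajectory    e_k : tracking error
%   F   : unknown nonaffine dynamics (u_k does not enter linearly)
%   U   : utility function      \gamma \in (0, 1] : discount factor
\begin{align}
  x_{k+1}     &= F(x_k, u_k), \\
  e_k         &= x_k - r_k, \\
  e_{k+1}     &= F(e_k + r_k, u_k) - r_{k+1}, \\
  Q(e_k, u_k) &= U(e_k, u_k) + \gamma \, Q(e_{k+1}, u_{k+1}).
\end{align}
```

Under these assumptions, driving $e_k$ to zero is equivalent to $x_k$ following $r_k$, which is why the tracking problem can be treated as a regulation problem in the error. In an action dependent structure the critic network typically approximates $Q(e_k, u_k)$ directly from the error and the control, so the action network can be tuned through the critic without an explicit model of $F$.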

Key words: tracking control, online learning, action dependent design
