Synchronization control of unknown heterogeneous multi-agent system via model-free adaptive dynamic programming

doi:10.11959/j.issn.2096-6652.202143

Chinese Journal of Intelligent Science and Technology, 2021, Vol. 3, Issue (4): 444-448. doi: 10.11959/j.issn.2096-6652.202143

Special column: Data-based learning and optimization

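Lina XIA¹, Qing LI¹, Ruizhuo SONG¹, Zihan WANG¹, Zhen XU²

¹ School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China
² Research Institute of Urbanization and Urban Safety, University of Science and Technology Beijing, Beijing 100083, China

Revised: 2021-11-09 Online: 2021-12-15 Published: 2021-12-01
About the authors: Lina XIA (1993- ), female, is a Ph.D. candidate at the School of Automation and Electrical Engineering, University of Science and Technology Beijing. Her main research interests are multi-agent synchronization, event-triggered control, and adaptive dynamic programming.
Qing LI (1971- ), male, Ph.D., is a professor and doctoral supervisor at the School of Automation and Electrical Engineering, University of Science and Technology Beijing. His main research interests are swarm intelligence optimization algorithms and their applications, and multi-agent synchronization.
Ruizhuo SONG (1982- ), female, Ph.D., is a professor and doctoral supervisor at the School of Automation and Electrical Engineering, University of Science and Technology Beijing. Her research covers data-based intelligent computing and optimal decision-making and control of complex systems, with applications to human body signal measurement and control and to target localization and recognition. She has published 85 papers, 44 of them SCI-indexed, and is first author of 3 ESI highly cited papers. She has published 3 academic monographs, 2 of them in English, and has led 20 research projects, including projects funded by the National Natural Science Foundation of China and the Beijing Natural Science Foundation as well as industry-sponsored projects. Her papers have been cited 1,386 times in Google Scholar, with an H-index of 19. She received the first prize of the 2017 Natural Science Award of the Chinese Association of Automation.
Zihan WANG (2000- ), male, is a student at the School of Automation and Electrical Engineering, University of Science and Technology Beijing. His main research interest is adaptive dynamic programming.
Zhen XU (1986- ) is a professor and doctoral supervisor at the Research Institute of Urbanization and Urban Safety, University of Science and Technology Beijing. His main research interest is the interdisciplinary study of integrated urban digital disaster prevention.
Supported by:
The National Natural Science Foundation of China (61873300, 61722312); The Fundamental Research Funds for the Central Universities (FRF-MP-20-11, FRF-IDRY-20-030)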

Abstract:

Multi-agent cooperation is increasingly applied in many fields, but many problems remain unsolved. This paper studies the synchronization control of unknown heterogeneous multi-agent systems based on a model-free adaptive dynamic programming (MFADP) algorithm, where the dynamics of both the leader and the followers are unknown. First, an observer is designed for each follower to estimate the leader's information, including the leader's state and system dynamic matrix. Then, an augmented system is constructed from the follower dynamics and the observer dynamics, and the Bellman optimality principle is used to derive the optimal controller and the corresponding algebraic Riccati equation. For the case in which the follower dynamics are unknown, an MFADP algorithm is proposed. Finally, two-mass-spring systems are used to verify the effectiveness of the proposed algorithm.

Key words: multi-agent system, algebraic Riccati equation, model-free adaptive dynamic programming
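
To make the link between an optimal controller and its algebraic Riccati equation (ARE) concrete, the following is a minimal sketch of model-based policy iteration (Kleinman's iteration) on a scalar linear-quadratic problem. It is not the MFADP algorithm of the paper, which is not reproduced on this page; it only illustrates the kind of fixed point that adaptive dynamic programming methods approximate. The scalar system x' = a*x + b*u, the cost weights `q` and `r`, and the function names `solve_scalar_are` and `policy_iteration` are all assumptions chosen for illustration.

```python
import math


def solve_scalar_are(a: float, b: float, q: float, r: float) -> float:
    """Stabilizing solution p > 0 of the scalar ARE 2*a*p - (b*p)**2 / r + q = 0.

    Hypothetical helper for the illustrative system x' = a*x + b*u with
    cost integral of q*x**2 + r*u**2 (requires b != 0, q > 0, r > 0).
    """
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def policy_iteration(a, b, q, r, k0, iters=20):
    """Kleinman-style policy iteration with state feedback u = -k*x.

    k0 must be stabilizing, i.e. a - b*k0 < 0. Returns the final (p, k).
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain k0 must be stabilizing")
    k = k0
    p = None
    for _ in range(iters):
        # Policy evaluation: scalar Lyapunov equation
        # 2*(a - b*k)*p + q + r*k**2 = 0, solved for p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: minimize the Hamiltonian for the current p.
        k = b * p / r
    return p, k


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    p_star = solve_scalar_are(1.0, 1.0, 1.0, 1.0)
    p, k = policy_iteration(1.0, 1.0, 1.0, 1.0, k0=3.0)
    print("closed-form ARE solution:", p_star)
    print("policy iteration result: ", p, "gain:", k)
```

Each pass evaluates the current feedback gain through a Lyapunov equation and then improves it, so the iterates approach the ARE solution. This sketch uses the model parameters `a` and `b` directly; a model-free variant would have to carry out the evaluation step from measured data instead.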

CLC number: TP18

Cite this article:

Lina XIA, Qing LI, Ruizhuo SONG, et al. Synchronization control of unknown heterogeneous multi-agent system via model-free adaptive dynamic programming[J]. Chinese Journal of Intelligent Science and Technology, 2021, 3(4): 444-448.

Figures/Tables 3

Figure 1

Table 1 Model parameters of the two-mass-spring systems

Agent i	$k_{i}^{1}$ /(N/m)	$k_{i}^{2}$ /(N/m)	$m_{i}^{1}$ /kg	$m_{i}^{2}$ /kg
1	1.5	1.05	5.1	10.1
2	1.48	0.9	5.2	9.9
3	1.46	0.95	4.9	10.1
4	1.61	1.1	5.01	10.12
5	1.63	1.12	5.04	10.12
6	1.55	0.9	5.2	10.05

Figure 2

