Chinese Journal of Intelligent Science and Technology ›› 2020, Vol. 2 ›› Issue (4): 327-340.doi: 10.11959/j.issn.2096-6652.202035
• Special Issue: Deep Reinforcement Learning •
Jinna LI, Weiran CHENG
Revised: 2020-12-03
Online: 2020-12-15
Published: 2020-12-01
Jinna LI, Weiran CHENG. An overview of optimal consensus for data driven multi-agent system based on reinforcement learning[J]. Chinese Journal of Intelligent Science and Technology, 2020, 2(4): 327-340.