Game 5.0: social cognition parallel game based on the parallel systems and machine game

doi:10.11959/j.issn.2096-6652.202151

Chinese Journal of Intelligent Science and Technology, 2021, Vol. 3, Issue (4): 507-520.

Yaling LI¹, Linyao YANG², Jun GE¹, Yuanqi QIN¹, Xiao WANG²,³

¹ Research Center for Intelligent Society and Governance, Zhejiang Laboratory, Hangzhou 311100, China
² The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
³ Qingdao Academy of Intelligent Industries, Qingdao 256200, China

Revised: 2021-11-22 Online: 2021-12-15 Published: 2021-12-01
About the authors: Yaling LI (1991- ), female, Ph.D., assistant researcher at the Research Center for Intelligent Society and Governance, Zhejiang Laboratory. Her main research interest is the application of artificial intelligence technology.
Linyao YANG (1995- ), male, Ph.D. candidate at the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences. His main research interests are graph representation learning and knowledge fusion.
Jun GE (1989- ), male, engineer at the Research Center for Intelligent Society and Governance, Zhejiang Laboratory. His main research interests include multi-agent cooperative control and reinforcement learning.
Yuanqi QIN (1992- ), female, engineer at the Research Center for Intelligent Society and Governance, Zhejiang Laboratory. Her main research interests include intelligent traffic control and intelligent social governance.
Xiao WANG (1988- ), female, Ph.D., associate researcher at the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences. Her main research interests are social transportation, dynamic cyber movement organizations, artificial intelligence, and social network analysis.
Supported by:
Center-initiated Research Project of Zhejiang Laboratory (115011-212004); Research Project of Zhejiang Society for Scientific and Technical Information (2021qbxh009)

Abstract:

Machine game is a form of artificial intelligence in which machines imitate the way people think and make decisions. It is currently a highly challenging research direction in the field of artificial intelligence, and its research status largely reflects the development level of artificial intelligence. This paper summarizes the basic concepts, research status, and typical applications of machine game, discusses the challenges and development trends of machine game research, and analyzes possible applications of machine-game-related research in social governance. It then proposes a social cognition parallel game method based on parallel systems and machine game, with the aim of providing new ideas for research on social governance.

Key words: machine game, artificial intelligence, parallel system, social governance

CLC number: TP301

Yaling LI, Linyao YANG, Jun GE, et al. Game 5.0: social cognition parallel game based on the parallel systems and machine game[J]. Chinese Journal of Intelligent Science and Technology, 2021, 3(4): 507-520.



Game	Board size	State-space complexity	Game-tree complexity
Tic-tac-toe	3×3	10³	10⁵
Checkers	8×8	10²¹	10³¹
Chess	8×8	10⁴⁶	10¹²³
Gomoku	15×15	10¹⁰⁵	10⁷⁰
Go	19×19	10¹⁷²	10³⁶⁰

