Chinese Journal of Intelligent Science and Technology, 2022, Vol. 4, Issue (1): 65-74. doi: 10.11959/j.issn.2096-6652.202213

• Special Topic: Swarm Intelligence •

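Emergence measurement of robot swarm intelligence based on swarm entropy

Pu FENG¹, Wenjun WU², Jie LUO¹, Xin YU¹, Yongkai TIAN¹

  1. School of Computer Science and Engineering, Beihang University, Beijing 100191, China
  2. Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
  • Revised: 2022-01-19 Online: 2022-03-15 Published: 2022-03-01
  • About the authors:
    Pu FENG (born 1995), male, doctoral student at the School of Computer Science and Engineering, Beihang University. His main research interest is multi-agent reinforcement learning.
    Wenjun WU (born 1973), male, PhD, professor and doctoral supervisor at the Institute of Artificial Intelligence, Beihang University. His main research interests include swarm intelligence, intelligent microservices, and cognitive modeling.
    Jie LUO (born 1981), male, PhD, associate professor at the School of Computer Science and Engineering, Beihang University. His main research interests include knowledge graph construction and swarm intelligence.
    Xin YU (born 1994), male, doctoral student at the School of Computer Science and Engineering, Beihang University. His main research interests are swarm intelligence and reinforcement learning.
    Yongkai TIAN (born 1997), male, master's student at the School of Computer Science and Engineering, Beihang University. His main research interest is multi-agent reinforcement learning.
  • Supported by:
    Science and Technology Innovation 2030: "New Generation Artificial Intelligence" Major Project (2018AAA0102300)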

Abstract:

Swarm behavior often produces value and complexity far beyond those of individual behavior. To derive swarm intelligence from individual intelligence more effectively, the level of swarm intelligence needs to be measured rigorously on the basis of swarm entropy, and swarm entropy should serve as the guiding objective that drives the enhancement and evolution of swarm intelligence. To address this problem, a swarm of unmanned vehicles was taken as the research object, and a multi-agent soft Q-learning algorithm based on parameter sharing and swarm policy entropy was proposed. By sharing the agents' observation information and combining it with maximum entropy reinforcement learning, the algorithm achieves continuous learning and updating of the swarm policy in exploratory tasks. In addition, swarm entropy was defined as a measurement tool that characterizes the pattern of entropy change during swarm learning, enabling quantitative analysis of the emergence process of swarm intelligence.

Key words: swarm entropy, swarm intelligence, deep reinforcement learning
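
The abstract names soft Q-learning, maximum entropy reinforcement learning, and an entropy-based swarm measure, but does not give their formulas. The following is a minimal sketch of the standard soft Q-learning quantities only: the Boltzmann policy and the soft value derived from a set of Q-values. The function names, the temperature `alpha`, and in particular the definition of `swarm_entropy` as the mean per-agent policy entropy are assumptions made for illustration, not the paper's definitions.

```python
import math


def soft_policy(q_values, alpha=1.0):
    """Boltzmann policy pi(a|s) proportional to exp(Q(s, a) / alpha)."""
    scaled = [q / alpha for q in q_values]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_value(q_values, alpha=1.0):
    """Soft state value V(s) = alpha * log(sum_a exp(Q(s, a) / alpha))."""
    scaled = [q / alpha for q in q_values]
    m = max(scaled)
    return alpha * (m + math.log(sum(math.exp(s - m) for s in scaled)))


def policy_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def swarm_entropy(per_agent_q_values, alpha=1.0):
    """ASSUMPTION: mean policy entropy over agents, used here as a stand-in
    for a swarm-level entropy measure. With parameter sharing, each row
    would come from the same Q-function applied to one agent's observation.
    """
    entropies = [policy_entropy(soft_policy(q, alpha)) for q in per_agent_q_values]
    return sum(entropies) / len(entropies)


if __name__ == "__main__":
    # Hypothetical Q-values for three agents with four actions each.
    q_table = [[0.0, 0.0, 0.0, 0.0], [2.0, 0.5, 0.1, 0.0], [5.0, 0.0, 0.0, 0.0]]
    print(swarm_entropy(q_table, alpha=1.0))
```

Under this assumed measure, a swarm whose agents still act almost uniformly has entropy close to the logarithm of the number of actions, and the value falls toward zero as the shared policy becomes more deterministic, which is one way an entropy change pattern during learning could be tracked.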

