Journal on Communications (通信学报), 2023, Vol. 44, Issue 6: 183-197. doi: 10.11959/j.issn.1000-436x.2023122

• Academic paper

GenFedRL:面向深度强化学习智能体的通用联邦强化学习框架

金彪1,2, 李逸康1, 姚志强1,2, 陈瑜霖1, 熊金波1,2   

  1. 1 福建师范大学计算机与网络空间安全学院,福建 福州 350007
    2 大数据分析与应用福建省高校工程研究中心,福建 福州 350007
  • 修回日期:2023-03-29 出版日期:2023-06-25 发布日期:2023-06-01
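  • About the authors: Biao JIN (1985- ), male, native of Lu'an, Anhui, Ph.D., associate professor and master's supervisor at Fujian Normal University. His main research interests include information security and privacy protection.
    Yikang LI (1998- ), male, native of Guangzhou, Guangdong, master's student at Fujian Normal University. His main research interest is the combined application of federated learning and deep reinforcement learning.
    Zhiqiang YAO (1967- ), male, native of Putian, Fujian, Ph.D., professor and doctoral supervisor at Fujian Normal University. His main research interests include information security and privacy protection.
    Yulin CHEN (1996- ), male, native of Quanzhou, Fujian, master's student at Fujian Normal University. His main research interest is deep learning technology.
    Jinbo XIONG (1981- ), male, native of Yiyang, Hunan, Ph.D., professor and doctoral supervisor at Fujian Normal University. His main research interests are secure deep learning, mobile crowdsensing, and privacy protection.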
  • 基金资助:
    国家自然科学基金资助项目(62272103)

GenFedRL: a general federated reinforcement learning framework for deep reinforcement learning agents

Biao JIN1,2, Yikang LI1, Zhiqiang YAO1,2, Yulin CHEN1, Jinbo XIONG1,2   

  1. College of Computer and Cyber Security, Fujian Normal University, Fuzhou 350007, China
  2. Fujian Provincial Colleges and University Engineering Research Center of Big Data Analysis and Application, Fuzhou 350007, China
  • Revised: 2023-03-29  Online: 2023-06-01  Published: 2023-06-25
  • Supported by:
    National Natural Science Foundation of China (62272103)

摘要:

针对智能物联网中,搭载深度强化学习智能体的智能设备缺乏有效安全数据共享机制的问题,提出一种面向深度强化学习智能体的通用联邦强化学习(GenFedRL)框架。GenFedRL不需要共享深度强化学习智能体的本地私有数据,而通过模型共享技术实现共同训练,在保护各智能体私有数据隐私的同时,有效地利用其数据资源和计算资源。为应对现实通信环境的复杂性与满足加速训练的需要,为GenFedRL设计了基于同步并行的模型共享机制。结合常见深度强化学习算法自身的模型结构特点,基于 FedAvg 算法设计了适用于单网络结构与多网络结构的通用联邦强化学习算法,进而实现了具有同种网络结构的智能体间的模型共享机制,更好地保护各类智能体的私有数据。仿真实验表明,即使在大部分数据节点无法参与训练的恶劣通信环境下,常见深度强化学习算法智能体在所提框架上仍表现出良好的性能。

关键词: 智能物联网, 联邦学习, 联邦强化学习, 深度强化学习

Abstract:

To address the lack of an effective mechanism for secure data sharing among intelligent devices equipped with deep reinforcement learning agents in the intelligent Internet of things, a general federated reinforcement learning (GenFedRL) framework for deep reinforcement learning agents was proposed. GenFedRL trains agents jointly through model sharing rather than by sharing their local private data, so the data and computing resources of each agent can be used effectively while the privacy of its private data is preserved. To cope with the complexity of real communication environments and to accelerate training, a synchronous parallel model-sharing mechanism was designed for GenFedRL. Based on the FedAvg algorithm and the model structures of common deep reinforcement learning algorithms, general federated reinforcement learning algorithms were designed for single-network and multi-network structures. These algorithms enable model sharing among agents with the same network structure and thereby better protect the private data of the various kinds of agents. Simulation experiments show that agents running common deep reinforcement learning algorithms still perform well on GenFedRL even in harsh communication environments where most data nodes cannot participate in training.

Key words: intelligent Internet of things, federated learning, federated reinforcement learning, deep reinforcement learning
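
The model-sharing idea summarized in the abstract can be illustrated with a minimal sketch of one synchronous FedAvg-style aggregation round. This is not the authors' implementation: the names `fedavg` and `sync_round`, the representation of each network as a flat list of weights, and the weighting by local sample count are all assumptions made for illustration. A single-network agent (for example, a value network only) is modeled as a dictionary with one entry, and a multi-network agent (for example, actor and critic) as a dictionary with several entries that are averaged network by network.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: network name -> flat list of weights.
# One entry models a single-network agent, several entries a multi-network agent.
Model = Dict[str, List[float]]


def fedavg(models: List[Model], sizes: List[int]) -> Model:
    """Average each named network across agents, weighted by local sample count."""
    if not models or len(models) != len(sizes):
        raise ValueError("need one sample count per uploaded model")
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    names = models[0].keys()
    # Only agents with the same network structure can share models.
    if any(m.keys() != names for m in models):
        raise ValueError("all agents must expose the same set of networks")
    merged: Model = {}
    for name in names:
        length = len(models[0][name])
        if any(len(m[name]) != length for m in models):
            raise ValueError(f"network '{name}' differs in size across agents")
        merged[name] = [
            sum(m[name][i] * n for m, n in zip(models, sizes)) / total
            for i in range(length)
        ]
    return merged


def sync_round(
    global_model: Model,
    uploads: List[Optional[Tuple[Model, int]]],
) -> Model:
    """Run one synchronous round; None marks a node that could not take part."""
    received = [u for u in uploads if u is not None]
    if not received:
        # No node reached the server, so the global model is kept unchanged.
        return global_model
    models = [model for model, _ in received]
    sizes = [size for _, size in received]
    return fedavg(models, sizes)
```

In this sketch, a round in which most nodes drop out still yields an updated global model from whichever nodes did upload, which mirrors the harsh communication setting described in the abstract. How the actual framework weights agents or handles dropped nodes is not stated in the abstract and may differ.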
