Chinese Journal of Intelligent Science and Technology ›› 2022, Vol. 4 ›› Issue (1): 75-83. doi: 10.11959/j.issn.2096-6652.202214

• Special Topic: Crowd Intelligence

A cooperative multi-agent reinforcement learning algorithm based on dynamic self-selection parameter sharing

Han WANG, Yang YU, Yuan JIANG   

  1. State Key Laboratory for Novel Software Technology at Nanjing University, Nanjing 210023, China
  • Revised: 2022-01-14 Online: 2022-03-15 Published: 2022-03-01
  • Supported by:
    The National Natural Science Foundation of China (61876077)

Abstract:

In multi-agent reinforcement learning, parameter sharing can effectively alleviate the learning inefficiency caused by non-stationarity. However, maintaining the same policy for all agents during learning may have detrimental effects. To solve this problem, a new approach is introduced that gives agents the ability to automatically identify peers that may benefit from parameter sharing and to dynamically share parameters with them during learning. Specifically, agents encode their experience trajectories as implicit information that represents their potential intentions, and select peers for parameter sharing by comparing these intentions. Experiments show that the proposed method not only improves the efficiency of parameter sharing but also ensures the quality of policy learning in multi-agent systems.
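
The selection step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the encoder is replaced by a simple mean over per-step feature vectors, intentions are compared by cosine similarity against a hypothetical threshold, and "sharing" is modeled as averaging parameter vectors among the selected peers. The function names (encode_intention, select_peers, share_parameters) are likewise hypothetical.

```python
import math

def encode_intention(trajectory):
    # Stand-in encoder (assumption): mean of the per-step feature vectors.
    dim = len(trajectory[0])
    return [sum(step[i] for step in trajectory) / len(trajectory) for i in range(dim)]

def cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector has zero norm.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def select_peers(intentions, threshold=0.9):
    # For each agent, keep itself plus every agent whose intention is similar enough.
    n = len(intentions)
    return [
        [j for j in range(n)
         if j == i or cosine(intentions[i], intentions[j]) >= threshold]
        for i in range(n)
    ]

def share_parameters(params, peers):
    # Assumed sharing rule: each agent takes the average over its selected peers.
    shared = []
    for group in peers:
        dim = len(params[group[0]])
        shared.append([sum(params[j][k] for j in group) / len(group) for k in range(dim)])
    return shared

# Toy data: agents 0 and 1 behave alike, agent 2 behaves differently.
trajectories = [
    [[1.0, 0.0], [0.9, 0.1]],
    [[0.8, 0.0], [1.0, 0.2]],
    [[0.0, 1.0], [0.1, 0.9]],
]
params = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]

intentions = [encode_intention(t) for t in trajectories]
peers = select_peers(intentions)
shared = share_parameters(params, peers)
```

In this toy setup, agents 0 and 1 select each other and end up with the same averaged parameters, while agent 2 keeps its own. Re-running the selection as trajectories change is what would make the grouping dynamic in such a sketch.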

Key words: multi-agent system, reinforcement learning, parameter sharing

