Multi-agent systems are a frontier research topic in distributed artificial intelligence. Traditional multi-agent reinforcement learning methods focus mainly on the emergence of group behavior, multi-agent cooperation and coordination, communication between agents, and opponent modeling and prediction. However, they still face challenges such as partially observable environments, non-stationary opponent strategies, high-dimensional decision spaces, and the difficulty of credit assignment. How to design multi-agent reinforcement learning methods that scale to large numbers of agents and adapt to multiple different application scenarios remains an open question in this field. This article first outlines the relevant research progress in multi-agent reinforcement learning. Second, it gives a comprehensive overview and summary of multi-agent learning methods of multiple types and paradigms from the perspectives of scalability and population adaptation. Four major categories of scalable learning methods are systematically reviewed: set permutation invariance, attention mechanisms, graph and network theory, and mean field theory. Four major categories of population-adaptive reinforcement learning methods are likewise reviewed: transfer learning, curriculum learning, meta-learning, and meta-games, together with typical application scenarios. Finally, future research directions are discussed from five aspects: benchmark platform development, bi-level optimization architectures, adversarial strategy learning, human-machine collaborative value alignment, and adaptive game decision-making loops, providing a reference for research on key frontier issues of multi-agent reinforcement learning in multimodal environments.