Big Data Research ›› 2021, Vol. 7 ›› Issue (5): 82-97. doi: 10.11959/j.issn.2096-0271.2021050

• TOPIC: BIG DATA PROCESSING SYSTEM IN CHINA’S HOMEMADE COMPUTING ENVIRONMENT

A wide-area collaborative scheduling system oriented to big data processing applications

Chenhao ZHANG1,2, Limin XIAO1,2, Guangjun QIN3, Yao SONG1,2, Shixuan JIANG1,2, Jiye WANG4   

    1 State Key Laboratory of Software Development Environment, Beijing 100191, China
    2 School of Computer Science and Engineering, Beihang University, Beijing 100191, China
    3 Smart City College, Beijing Union University, Beijing 100101, China
    4 Big Data Center, State Grid Corporation of China, Beijing 100031, China
  • Online: 2021-09-15 Published: 2021-09-01
  • Supported by:
    The National Key Research and Development Program of China (2017YFB1010000)

Abstract:

Building on the global virtual data space system for high-performance computing, a wide-area collaborative scheduling system for big data processing applications was designed and implemented. The system addresses the problem of how big data processing applications can make unified use of wide-area storage and computing resources. Based on each application's computing characteristics and data layout, it schedules application data and computing tasks together using collaborative scheduling, load-balancing scheduling, and data-locality scheduling strategies. By scheduling application data and computing tasks in a unified way across the wide-area environment, it coordinates the use of wide-area computing and storage resources and effectively improves the running performance of big data processing applications. Test results from the national high-performance computing environment show that the proposed scheduling method effectively supports big data processing applications, and that the running efficiency of typical applications, such as wide-area collaborative target recognition and molecular docking, can be increased by 3 to 4 times.
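
The abstract names data-locality and load-balancing scheduling but does not describe how the system combines them. The following is only a minimal sketch of one plausible combination, not the authors' method: the `Site` class, the `schedule` function, and the `max_imbalance` threshold are all hypothetical names introduced for illustration. A task runs where its data already resides unless that site is too heavily loaded, in which case the task goes to the least-loaded site and its data is marked to move with it.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A hypothetical computing center in the wide-area environment."""
    name: str
    load: int = 0  # number of tasks already assigned to this site
    datasets: set = field(default_factory=set)  # datasets stored at this site


def schedule(tasks, sites, max_imbalance=2):
    """Assign each (task, dataset) pair to a site (illustrative only).

    Returns a list of (task, site_name, data_moved) tuples.
    """
    plan = []
    for task, dataset in tasks:
        least_loaded = min(sites, key=lambda s: s.load)
        # Data locality: consider the sites that already hold the dataset.
        holders = [s for s in sites if dataset in s.datasets]
        local = min(holders, key=lambda s: s.load) if holders else None
        if local is not None and local.load - least_loaded.load <= max_imbalance:
            chosen, moved = local, False
        else:
            # Load balancing: use the least-loaded site instead and
            # place a copy of the dataset there alongside the task.
            chosen, moved = least_loaded, True
            chosen.datasets.add(dataset)
        chosen.load += 1
        plan.append((task, chosen.name, moved))
    return plan


if __name__ == "__main__":
    sites = [Site("A", datasets={"d1"}), Site("B")]
    tasks = [("t1", "d1"), ("t2", "d1"), ("t3", "d1"), ("t4", "d1")]
    for entry in schedule(tasks, sites, max_imbalance=1):
        print(entry)
```

In this sketch the threshold trades data movement against load skew: a larger `max_imbalance` keeps more tasks next to their data, while a smaller one spreads tasks more evenly at the cost of more transfers.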

Key words: wide-area collaborative scheduling, big data processing application, global virtual data space, high-performance computing environment

