State-of-art research of cluster resource management in dataflow computing model

doi:10.11959/j.issn.2096-0271.2020026

Abstract

The development of cluster-based high-performance computing has gone through three stages of evolution. With the widespread use of dataflow programming models such as Spark and Flink in big data computing, ensuring that the various dataflow computing applications share cluster resources fairly has become extremely important; it is also a main means of reducing infrastructure cost. As the drawbacks of traditional cluster resource management have become increasingly apparent under the dataflow computing model, many alternative approaches to cluster resource management have been proposed in recent years, including HoD, centralized scheduling, two-level scheduling, distributed scheduling, and hybrid scheduling. This paper introduces their respective advantages and disadvantages and offers a reference for those applying or researching cluster resource management and scheduling in dataflow computing environments.

Key words: dataflow model, cluster resource, schedule framework, big data

CLC Number: TP31

Xiaochun TANG, Ying FU, Zhao DING, Anqi MAO, Zhanhuai LI. State-of-art research of cluster resource management in dataflow computing model[J]. Big Data Research, 2020, 6(3): 87-100.
