Journal on Communications ›› 2017, Vol. 38 ›› Issue (9): 133-147. doi: 10.11959/j.issn.1000-436x.2017188

• Papers •

Progressive filling partitioning and mapping algorithm for Spark based on allocation fitness degree

Chen BIAN1, Jiong YU1, Wei-rong XIU1, Bin LIAO2, Chang-tian YING1, Yu-rong QIAN1

  1 College of Software, Xinjiang University, Urumqi 830008, China
  2 College of Statistics and Information, Xinjiang University of Finance and Economics, Urumqi 830012, China
  • Revised: 2017-06-14  Online: 2017-09-01  Published: 2017-10-18
  • Supported by:
    The National Natural Science Foundation of China (61262088); The National Natural Science Foundation of China (61462079); The National Natural Science Foundation of China (61562078); The National Natural Science Foundation of China (61363083); The National Natural Science Foundation of China (61562086); The Natural Science Foundation of Xinjiang Uygur Autonomous Region (2017D01A20); The Educational Research Program of Xinjiang Uygur Autonomous Region (XJED2016S106); The Doctoral Research Foundation of Xinjiang University of Finance and Economics (2015BS007)

Abstract:

The job execution mechanism of Spark was analyzed, a task efficiency model and a Shuffle model were established, and the allocation fitness degree (AFD) was defined together with the optimization goal. On the basis of these model definitions, the progressive filling partitioning and mapping (PFPM) algorithm was proposed. PFPM builds a data distribution scheme that adapts to the computing ability of each Reducer, which decreases synchronization latency during the Shuffle process and increases the computing efficiency of the cluster. Experiments demonstrate that PFPM improves the rationality of workload distribution in Shuffle and optimizes the execution efficiency of Spark.
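
The idea of matching the data distribution to each Reducer's computing ability can be illustrated with a small sketch. The abstract does not give the details of PFPM or the formula for AFD, so the code below is not the authors' algorithm: it is a generic greedy heuristic (largest bucket first, assigned to the Reducer with the earliest estimated finish time), and the names `greedy_fill`, `bucket_sizes`, and `reducer_speeds` are hypothetical.

```python
from typing import Dict, List, Tuple


def greedy_fill(bucket_sizes: Dict[str, float],
                reducer_speeds: List[float]) -> Tuple[Dict[str, int], List[float]]:
    """Assign each intermediate-data bucket to one reducer, largest bucket first.

    Assumption for illustration only: a reducer's finish time is estimated
    as (assigned data size) / (relative speed). This is a generic greedy
    heuristic, not the PFPM algorithm from the paper.
    """
    if not reducer_speeds or any(s <= 0 for s in reducer_speeds):
        raise ValueError("reducer_speeds must be non-empty and positive")

    loads = [0.0] * len(reducer_speeds)   # data assigned to each reducer so far
    assignment: Dict[str, int] = {}       # bucket name -> reducer index

    # Place large buckets first; ties are broken by bucket name for determinism.
    ordered = sorted(bucket_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for bucket, size in ordered:
        # Pick the reducer that would finish earliest after taking this bucket.
        target = min(range(len(loads)),
                     key=lambda r: (loads[r] + size) / reducer_speeds[r])
        assignment[bucket] = target
        loads[target] += size

    finish_times = [load / speed for load, speed in zip(loads, reducer_speeds)]
    return assignment, finish_times
```

For example, with bucket sizes 60, 30, 30, 20, and 10 and two Reducers whose relative speeds are 2.0 and 1.0, the faster Reducer receives 100 units and the slower one 50, so both have an estimated finish time of 50. Balanced finish times of this kind are what reduce the synchronization wait at the end of a Shuffle stage.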

Key words: parallel computing, Spark, progressive filling, partitioning and mapping, allocation fitness degree

