Journal on Communications ›› 2019, Vol. 40 ›› Issue (8): 85-101. doi: 10.11959/j.issn.1000-436x.2019173

• Academic paper

Flow-network-based elastic resource scheduling strategy for the Flink platform

李梓杨1,于炯1,2,卞琛3,张译天2,蒲勇霖1,王跃飞1,鲁亮4   

  1. 1 新疆大学信息科学与工程学院,新疆 乌鲁木齐 830046
    2 新疆大学软件学院,新疆 乌鲁木齐 830008
    3 广东金融学院互联网金融与信息工程学院,广东 广州 510521
    4 中国民航大学计算机科学与技术学院,天津 300300
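  • Revised: 2019-07-24 Publication date: 2019-08-25 Release date: 2019-08-30
  • About the authors: Ziyang LI (1993- ), male, a native of Urumqi, Xinjiang, is a Ph.D. candidate at Xinjiang University; his research interests include distributed systems, in-memory computing, and stream computing. | Jiong YU (1964- ), male, a native of Beijing, Ph.D., is a professor and doctoral supervisor at Xinjiang University; his research interests include grid computing, parallel computing, and distributed systems. | Chen BIAN (1981- ), male, a native of Nanjing, Jiangsu, Ph.D., is an associate professor at Guangdong University of Finance; his research interests include distributed systems, in-memory computing, and green computing. | Yitian ZHANG (1995- ), male, a native of Shangqiu, Henan, is a master's student at Xinjiang University; his research interests include cloud computing, real-time computing, and distributed computing. | Yonglin PU (1991- ), male, a native of Zibo, Shandong, is a Ph.D. candidate at Xinjiang University; his research interests include in-memory computing, stream computing, and green computing. | Yuefei WANG (1991- ), male, a native of Urumqi, Xinjiang, is a Ph.D. candidate at Xinjiang University; his research interests include data mining and machine learning. | Liang LU (1990- ), male, a native of Xiangtan, Hunan, Ph.D., is a lecturer at Civil Aviation University of China; his research interests include distributed systems, in-memory computing, and stream computing.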
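  • Funding:
    National Natural Science Foundation of China (61862060); National Natural Science Foundation of China (61462079); National Natural Science Foundation of China (61562086); National Natural Science Foundation of China (61562078); Science and Technology Support Program of the Ministry of Science and Technology of China (2015BAH02F01); Natural Science Foundation of Xinjiang Uygur Autonomous Region (2017D01A20); Scientific Research Program of Higher Education Institutions of Xinjiang Uygur Autonomous Region (XJEDU2016S106)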

Flow-network based auto rescale strategy for Flink

Ziyang LI1,Jiong YU1,2,Chen BIAN3,Yitian ZHANG2,Yonglin PU1,Yuefei WANG1,Liang LU4   

  1. 1 School of Information Science and Engineering,Xinjiang University,Urumqi 830046,China
    2 School of Software,Xinjiang University,Urumqi 830008,China
    3 College of Internet Finance and Information Engineering,Guangdong University of Finance,Guangzhou 510521,China
    4 School of Computer Science and Technology,Civil Aviation University of China,Tianjin 300300,China
  • Revised:2019-07-24 Online:2019-08-25 Published:2019-08-30
  • Supported by:
    The National Natural Science Foundation of China(61862060);The National Natural Science Foundation of China(61462079);The National Natural Science Foundation of China(61562086);The National Natural Science Foundation of China(61562078);The Science and Technology Support Projects of Ministry of National Science and Technology(2015BAH02F01);The Natural Science Foundation of Xinjiang Uygur Autonomous Region of China(2017D01A20);Educational Research Program of Xinjiang Uygur Autonomous Region of China(XJEDU2016S106)

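Abstract:

To address the problem that the computing load on big data stream computing platforms rises with fluctuation while the cluster cannot respond effectively to load changes, a flow-network-based elastic resource scheduling strategy for the Flink platform (FAR-Flink) is proposed. The strategy first builds a flow network model and computes the capacity of each edge with a construction algorithm. It then uses an elastic resource scheduling algorithm to identify the cluster's performance bottleneck and draw up a dynamic resource scheduling plan. Finally, a state data migration algorithm based on data clustering and bucket management carries out the scheduling plan and migrates data between nodes efficiently. Experimental results show that the strategy performs well in application scenarios with complex state data: while satisfying the computing latency constraint, it raises cluster throughput and shortens state data migration time. The FAR-Flink strategy thus effectively improves the cluster's ability to respond to load fluctuations.

Keywords: stream computing, resource scheduling, elastic cluster, load migration, Flink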

Abstract:

In order to solve the problem that the load of big data stream computing platforms increases with fluctuation while the cluster is not able to rescale efficiently, a flow-network-based auto-rescale strategy for Flink was proposed. Firstly, a flow-network model was set up and the capacity of each edge was calculated by a self-learning algorithm. Secondly, the bottleneck of the cluster was identified by a maximum-flow algorithm and a resource rescheduling plan was drawn up. Finally, the resource rescheduling plan was executed and the stateful data was migrated efficiently by a data migration algorithm based on partitioning the data by bulk and bucket. The experimental results show that the strategy effectively improves performance in applications with complex stateful data. It improved the throughput of the cluster and reduced the time overhead of data migration while satisfying the latency constraint of the application, which means that the strategy efficiently improves the scalability of the cluster.

Key words: stream computing, resource scheduling, elastic cluster, load migration, Flink
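
As a rough illustration of the bottleneck-detection step named in the abstract, the following Python sketch runs a generic Edmonds-Karp maximum-flow computation on a small operator graph and reports the saturated minimum-cut edges as bottleneck candidates. It is not the paper's algorithm, which is not given on this page; the operator names, the capacity values (read here as records per second), and the function `max_flow_min_cut` are all assumptions made for the example.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Generic Edmonds-Karp: return (max flow value, sorted min-cut edges)."""
    # Residual graph: copy the forward capacities, add zero-capacity reverse edges.
    residual = {u: dict(vs) for u, vs in capacity.items()}
    for u, vs in capacity.items():
        for v in vs:
            residual.setdefault(v, {}).setdefault(u, 0)

    def bfs():
        # Breadth-first search over edges that still have residual capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    total = 0
    while True:
        parent = bfs()
        if sink not in parent:
            break  # no augmenting path left
        # Walk back from the sink to recover the augmenting path.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        total += push

    # Nodes still reachable from the source form one side of the minimum cut;
    # original edges leaving that side are saturated, i.e. bottleneck candidates.
    reachable = set(parent)
    cut = sorted(
        (u, v)
        for u, vs in capacity.items()
        for v in vs
        if u in reachable and v not in reachable
    )
    return total, cut


# Hypothetical operator graph; capacities are made-up throughput limits.
capacity = {
    "source": {"map1": 50, "map2": 50},
    "map1": {"window": 30},
    "map2": {"window": 40},
    "window": {"sink": 100},
    "sink": {},
}

flow, bottlenecks = max_flow_min_cut(capacity, "source", "sink")
print(flow, bottlenecks)
```

In this toy graph the two edges into `window` saturate first, so a rescaling plan would target them; how FAR-Flink actually derives capacities and chooses where to add resources is described in the full paper, not here.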
