Journal on Communications ›› 2019, Vol. 40 ›› Issue (8): 85-101. doi: 10.11959/j.issn.1000-436x.2019173
Ziyang LI1, Jiong YU1,2, Chen BIAN3, Yitian ZHANG2, Yonglin PU1, Yuefei WANG1, Liang LU4
Revised: 2019-07-24
Online: 2019-08-25
Published: 2019-08-30
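About the authors:
Ziyang LI (1993- ), male, born in Urumqi, Xinjiang, is a Ph.D. candidate at Xinjiang University. His research interests include distributed systems, in-memory computing and stream computing.
Jiong YU (1964- ), male, born in Beijing, Ph.D., is a professor and doctoral supervisor at Xinjiang University. His research interests include grid computing, parallel computing and distributed systems.
Chen BIAN (1981- ), male, born in Nanjing, Jiangsu, Ph.D., is an associate professor at Guangdong University of Finance. His research interests include distributed systems, in-memory computing and green computing.
Yitian ZHANG (1995- ), male, born in Shangqiu, Henan, is a master's student at Xinjiang University. His research interests include cloud computing, real-time computing and distributed computing.
Yonglin PU (1991- ), male, born in Zibo, Shandong, is a Ph.D. candidate at Xinjiang University. His research interests include in-memory computing, stream computing and green computing.
Yuefei WANG (1991- ), male, born in Urumqi, Xinjiang, is a Ph.D. candidate at Xinjiang University. His research interests include data mining and machine learning.
Liang LU (1990- ), male, born in Xiangtan, Hunan, Ph.D., is a lecturer at Civil Aviation University of China. His research interests include distributed systems, in-memory computing and stream computing.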
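Abstract:
On big-data stream computing platforms the computing load can fluctuate upward while the cluster has no effective way to respond to the change. To address this problem, a flow-network based elastic resource scheduling strategy for the Flink platform (FAR-Flink) is proposed. The strategy first builds a flow network model and computes the capacity of every edge with a construction algorithm. It then applies an elastic resource scheduling algorithm to identify the cluster's performance bottleneck and draw up a dynamic resource scheduling plan. Finally, a state data migration algorithm based on data clustering and bucket management carries out the plan and migrates data efficiently between nodes. Experimental results show that the strategy works well in scenarios with complex state data: it raises cluster throughput while meeting the latency constraint and shortens state data migration time. FAR-Flink therefore improves the cluster's ability to respond to load fluctuations.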
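The abstract's idea of assigning a capacity to every edge and then locating the bottleneck can be illustrated with a generic maximum-flow / minimum-cut computation. The sketch below is not the paper's construction or scheduling algorithm; it uses the textbook Edmonds-Karp method, and the operator names (`map1`, `agg`, ...) and capacities (tuples per second) are hypothetical values chosen for illustration only.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow. Returns (flow value, residual graph)."""
    residual = defaultdict(lambda: defaultdict(int))
    for (u, v), c in capacity.items():
        residual[u][v] += c
        residual[v][u] += 0  # make sure the reverse edge exists
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total, residual
        # Walk back from the sink to collect the path edges.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        total += push


def bottleneck_edges(capacity, source, sink):
    """Return (max throughput, saturated min-cut edges) of the flow network."""
    value, residual = max_flow(capacity, source, sink)
    reachable = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v, c in residual[u].items():
            if c > 0 and v not in reachable:
                reachable.add(v)
                stack.append(v)
    cut = sorted((u, v) for (u, v) in capacity
                 if u in reachable and v not in reachable)
    return value, cut


# Hypothetical topology: two map instances feeding one aggregation operator.
example_capacity = {
    ("source", "map1"): 500,
    ("source", "map2"): 500,
    ("map1", "agg"): 300,
    ("map2", "agg"): 300,
    ("agg", "sink"): 1000,
}
throughput, bottleneck = bottleneck_edges(example_capacity, "source", "sink")
print(throughput, bottleneck)
```

In this toy network the edges into `agg` form the minimum cut, so they are the ones whose capacity would have to grow (for instance by raising the operator's parallelism) before the overall throughput could increase.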
Ziyang LI, Jiong YU, Chen BIAN, Yitian ZHANG, Yonglin PU, Yuefei WANG, Liang LU. Flow-network based auto rescale strategy for Flink[J]. Journal on Communications, 2019, 40(8): 85-101.
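Table 3
Performance parameter configuration
Configuration item | Value | Description |
jobmanager.heap.size | 2048 m | Heap memory of the master node (JobManager) |
taskmanager.heap.size | 2048 m | Heap memory of each worker node (TaskManager) |
taskmanager.numberOfTaskSlots | 2 | Number of task slots (threads) per node |
high-availability | ZooKeeper | Enables HA mode |
state.backend | rocksdb | State data storage |
state.backend.incremental | True | Incremental snapshots |
taskmanager.network.memory.fraction | 0.2 | Network buffer size (fraction of memory) |
taskmanager.network.memory.max | 500 m | Upper limit of the network buffer |
taskmanager.memory.segment-size | 32768 | Memory segment size |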
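As a rough illustration of how the Table 3 settings would be written down, the sketch below renders them as flat `key: value` lines of the kind used in Flink's `flink-conf.yaml`. The keys and values are copied from the table; the compact notation for memory sizes (`2048m`, `500m`) and the lowercase `zookeeper` and `true` are assumptions, since the accepted spelling depends on the Flink version in use. The helper name `render_flink_conf` is hypothetical.

```python
# Settings copied from Table 3; value notation is an assumption (see lead-in).
TABLE_3_SETTINGS = {
    "jobmanager.heap.size": "2048m",
    "taskmanager.heap.size": "2048m",
    "taskmanager.numberOfTaskSlots": 2,
    "high-availability": "zookeeper",  # the table lists this as "ZooKeeper"
    "state.backend": "rocksdb",
    "state.backend.incremental": "true",
    "taskmanager.network.memory.fraction": 0.2,
    "taskmanager.network.memory.max": "500m",
    "taskmanager.memory.segment-size": 32768,
}


def render_flink_conf(settings):
    """Render a settings mapping as 'key: value' lines, one per entry."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items()) + "\n"


conf_text = render_flink_conf(TABLE_3_SETTINGS)
print(conf_text)
```

Keeping the settings in one mapping makes it easy to compare an experiment's configuration against the table before a run.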
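Table 4
Comparison of experimental results
System | Scheduling strategy | Advantages | Disadvantages | Applicable scenarios |
Original system | Default scheduling strategy | Supports exactly-once stateful stream processing | No elastic resource scheduling strategy | Computing load that is stable or fluctuates only slightly |
EN | Uses a mathematical model to compute a reasonable parallelism for each operator and dynamically adds computing resources | Computation can continue while data is being migrated | High time overhead for data migration | Load that rises continuously and by a large margin, with a small volume of state data |
FAR-Flink | First distributes the increased load sensibly, then uses the flow network model to detect the operators whose parallelism must be raised, and dynamically adds computing resources | Allocates computing resources accurately and effectively reduces the time overhead of data migration | Tasks stall very briefly (about 2~3 s) during data migration | Load that rises continuously and by a large margin, with a large volume of state data |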