Journal on Communications ›› 2019, Vol. 40 ›› Issue (12): 68-85. doi: 10.11959/j.issn.1000-436x.2019226
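Yonglin PU, Jiong YU, Liang LU, Ziyang LI, Chen BIAN, Bin LIAO
Revised: 2019-10-19
Online: 2019-12-25
Published: 2020-01-16
About the authors:
Yonglin PU (1991- ), male, born in Zibo, Shandong, is a Ph.D. candidate at Xinjiang University. His main research interests include stream computing, green computing, and in-memory computing.
Jiong YU (1964- ), male, born in Urumqi, Xinjiang, Ph.D., is a professor and doctoral supervisor at Xinjiang University. His main research interests include parallel computing, distributed systems, and green computing.
Liang LU (1990- ), male, born in Tianjin, Ph.D., is a lecturer at Civil Aviation University of China. His main research interests are distributed systems, in-memory computing, and green computing.
Ziyang LI (1993- ), male, born in Urumqi, Xinjiang, is a Ph.D. candidate at Xinjiang University. His main research interests include stream computing and in-memory computing.
Chen BIAN (1981- ), male, born in Nanjing, Jiangsu, Ph.D., is an associate professor at Guangdong University of Finance. His main research interests include distributed systems, in-memory computing, and green computing.
Bin LIAO (1986- ), male, born in Urumqi, Xinjiang, Ph.D., is an associate professor and master's supervisor at Xinjiang University of Finance and Economics. His main research interests include distributed systems, database theory and technology, and green computing.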
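Abstract:
To address the low efficiency and high energy consumption of Storm, a resource constraint model, an optimal thread data reorganization principle, and a node voltage-reduction principle were designed by analyzing the basic framework and topology of the Storm platform. On this basis, an energy-efficient strategy for data migration and merging in Storm (DMM-Storm) was proposed, comprising a resource constraint algorithm, a data migration and merging algorithm, and a node voltage-reduction algorithm. The resource constraint algorithm uses the resource constraint model to decide whether a work node permits data migration. The data migration and merging algorithm derives an optimal thread data migration method from the optimal thread data reorganization principle. The node voltage-reduction algorithm lowers the voltage of work nodes subject to the node voltage-reduction constraints. Experimental results show that, compared with existing energy-efficient strategies, DMM-Storm effectively reduces energy consumption without affecting cluster performance.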
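The resource constraint step in the abstract can be pictured with a minimal sketch. It assumes that a work node may accept migrated thread data only if its CPU, memory, and network bandwidth all stay under a ceiling afterwards. The names `NodeLoad` and `allows_migration` and the 0.9 ceiling are hypothetical, chosen for illustration; the abstract does not state the actual form of the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeLoad:
    """Fractional resource utilization of one work node (0.0 to 1.0)."""
    cpu: float
    memory: float
    bandwidth: float


def allows_migration(target: NodeLoad, incoming: NodeLoad,
                     ceiling: float = 0.9) -> bool:
    """Return True if the target node stays at or below `ceiling` on every
    resource after absorbing the incoming load.

    The ceiling is an assumed value, not one taken from the paper.
    """
    pairs = (
        (target.cpu, incoming.cpu),
        (target.memory, incoming.memory),
        (target.bandwidth, incoming.bandwidth),
    )
    return all(current + extra <= ceiling for current, extra in pairs)


if __name__ == "__main__":
    idle_node = NodeLoad(cpu=0.5, memory=0.5, bandwidth=0.5)
    busy_node = NodeLoad(cpu=0.8, memory=0.5, bandwidth=0.5)
    migrated = NodeLoad(cpu=0.2, memory=0.2, bandwidth=0.2)
    print(allows_migration(idle_node, migrated))
    print(allows_migration(busy_node, migrated))
```

Checking all three resources together reflects the idea that a single saturated resource is enough to make a node unsuitable as a migration target.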
Yonglin PU, Jiong YU, Liang LU, Ziyang LI, Chen BIAN, Bin LIAO. Energy-efficient strategy for data migration and merging in Storm[J]. Journal on Communications, 2019, 40(12): 68-85.
Table 1
Storm cluster configuration
Node | CPU | Memory | Network bandwidth |
Nimbus, ZooKeeper1 (Leader) | Intel Core i7-4790 3.6 GHz quad-core | 8 GB DDR3 1 066 MHz | 100 Mbit/s LAN |
Supervisor1~Supervisor16 | Intel Core i7-4790 3.6 GHz quad-core | 8 GB DDR3 1 066 MHz | 100 Mbit/s LAN |
ZooKeeper2, ZooKeeper3 (Follower) | Intel Core i7-4790 3.6 GHz quad-core | 8 GB DDR3 1 066 MHz | 100 Mbit/s LAN |
Table 3
Benchmark parameter configuration
Benchmark | Parameter | Value |
WordCount | component.spout_num | 60 |
WordCount | component.split_bolt_num | 120 |
WordCount | component.count_bolt_num | 120 |
WordCount | topology.works | 16 |
WordCount | topology.acker.executors | 16 |
WordCount | topology.max.spout.pending | 200 |
RollingSort | component.spout_num | 60 |
RollingSort | component.sort_bolt_num | 120 |
RollingSort | emit.frequency | 10 |
RollingSort | chunk.size | 2 000 000 |
RollingSort | message.size | 100 000 |
Sol | topology.level | 3 |
Sol | message.size | 2 000 |
Sol | component.spout_num | 60 |
Sol | component.bolt_num | 120 |
RollingCount | component.spout_num | 60 |
RollingCount | component.split_bolt_num | 120 |
RollingCount | component.rolling_count_bolt_num | 120 |
RollingCount | window.length | 150 |
RollingCount | emit.frequency | 30 |
Table 5
Resource utilization of the cluster when running different benchmarks
Benchmark | Algorithm | CPU utilization | Network bandwidth utilization | Memory utilization |
RollingCount | Original system | 53.8% | 49.4% | 53.5% |
RollingCount | DMMNE | 72.5% | 70.5% | 73.5% |
RollingCount | DMMCE | 73.1% | 69.2% | 74.3% |
WordCount | Original system | 72.6% | 71.3% | 61.4% |
WordCount | DMMNE | 86.3% | 88.3% | 80.7% |
WordCount | DMMCE | 85.2% | 89.2% | 79.8% |
Sol | Original system | 24.5% | 69.6% | 51.3% |
Sol | DMMNE | 60.6% | 88.7% | 72.8% |
Sol | DMMCE | 63.4% | 87.3% | 71.9% |
RollingSort | Original system | 58.7% | 51.2% | 49.8% |
RollingSort | DMMNE | 72.4% | 75.4% | 74.8% |
RollingSort | DMMCE | 73.7% | 76.4% | 75.6% |
Table 6
Power consumption statistics after the cluster stabilizes
Experimental group | Minimum power/W | Maximum power/W | Average power/W |
test11 | 1 000.787 46 | 1 183.516 94 | 1 083.995 012 |
test12 | 840.833 95 | 985.700 53 | 905.268 979 5 |
test13 | 820.316 05 | 985.104 32 | 911.453 409 2 |
test21 | 1 015.426 56 | 1 195.926 06 | 1 125.715 989 |
test22 | 867.373 92 | 936.333 72 | 915.091 409 7 |
test23 | 852.070 68 | 947.363 31 | 911.314 227 2 |
test31 | 962.508 52 | 1 192.577 08 | 1 076.023 367 |
test32 | 856.864 18 | 965.197 93 | 926.058 879 5 |
test33 | 832.394 69 | 959.708 03 | 905.334 361 6 |
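The average power figures in Table 6 can be turned into relative reductions with a short calculation. The sketch below assumes, since this page does not define the group names, that testN1 is the baseline run of series N and that testN2 and testN3 are the runs compared against it. The helper names `relative_reduction` and `reductions` are illustrative only.

```python
# Average power (W) per experimental group, copied from Table 6.
AVERAGE_POWER_W = {
    "test11": 1083.995012, "test12": 905.2689795, "test13": 911.4534092,
    "test21": 1125.715989, "test22": 915.0914097, "test23": 911.3142272,
    "test31": 1076.023367, "test32": 926.0588795, "test33": 905.3343616,
}


def relative_reduction(baseline_w: float, measured_w: float) -> float:
    """Fraction by which measured_w is lower than baseline_w."""
    if baseline_w <= 0:
        raise ValueError("baseline power must be positive")
    return (baseline_w - measured_w) / baseline_w


def reductions(power: dict[str, float]) -> dict[str, float]:
    """Compare every non-baseline group with the group whose name ends in
    "1" in the same series (assumed here to be the baseline run)."""
    result = {}
    for name, watts in power.items():
        series, variant = name[:-1], name[-1]
        if variant == "1":
            continue  # the assumed baseline is not compared with itself
        result[name] = relative_reduction(power[series + "1"], watts)
    return result


if __name__ == "__main__":
    for name, fraction in reductions(AVERAGE_POWER_W).items():
        print(f"{name}: {fraction:.1%} lower average power than {name[:-1]}1")
```

Under that naming assumption, the reductions in average power fall roughly between 14% and 19% across the three series.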