Telecommunications Science (电信科学) ›› 2015, Vol. 31 ›› Issue (5): 138-142. doi: 10.11959/j.issn.1000-0801.2015116

• Operation Technology Wide Angle •

Discussion of Inter-Cluster Network Performance Optimization in Cloud Data Center

Duoyuan Jiang, Haixiong Chen

  1. Petroleum Exploration & Production Research Institute, SINOPEC, Beijing 100083, China
  • Online: 2015-05-15 Published: 2015-08-20

Abstract:

Modern large-scale cloud computing tasks often require several clusters to work together, so planning the network that connects the clusters and guaranteeing its performance are important. Inter-cluster network performance optimization in a data center is discussed through a real case of a performance problem and its resolution. The case comes from a data center used for large-scale real-time data processing, where severe packet loss occurred during a large-scale concurrent task even though the bandwidth was not fully used. The performance bottleneck was located by analyzing the topology. A test environment was then built and measurements were taken to find the source of the bottleneck: a switch whose processing power and packet buffer (cache) were insufficient even when its links were not fully loaded. Finally, the influence of each factor on performance was analyzed on the basis of a model, and a solution based on increasing the frame length was designed from the results.

Key words: cluster network, performance optimization, throughput
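
The abstract does not reproduce the paper's model, so the following is only a minimal sketch of why a longer frame can relieve a switch that runs out of processing power before its links fill up. It assumes the switch's limit is roughly the number of frames it must forward per second, and that every frame is full size. The function name `frames_per_second`, the 10 Gbit/s link rate, and the 60% utilization are hypothetical values chosen for illustration and do not come from the paper. The per-frame overhead figures are the standard Ethernet ones.

```python
# Standard Ethernet overhead per frame, in bytes.
ETH_HEADER_FCS_BYTES = 18   # 14-byte header + 4-byte frame check sequence
WIRE_OVERHEAD_BYTES = 20    # 8-byte preamble/SFD + 12-byte inter-frame gap


def frames_per_second(link_bps: float, utilization: float, mtu_bytes: int) -> float:
    """Frames per second needed to carry `utilization` of the link with full-size frames.

    Assumption (not from the paper): the switch's processing cost scales with
    the number of frames, not with the number of bytes.
    """
    wire_bytes = mtu_bytes + ETH_HEADER_FCS_BYTES + WIRE_OVERHEAD_BYTES
    return link_bps * utilization / (wire_bytes * 8)


if __name__ == "__main__":
    link_bps = 10e9       # hypothetical 10 Gbit/s link
    utilization = 0.6     # hypothetical load, below full bandwidth
    for mtu in (1500, 9000):
        fps = frames_per_second(link_bps, utilization, mtu)
        print(f"MTU {mtu}: {fps:,.0f} frames/s")
```

Under these assumptions, moving from a 1500-byte to a 9000-byte MTU carries the same traffic with roughly one sixth as many frames, which is the kind of effect a frame-length-based fix relies on.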
