Telecommunications Science ›› 2016, Vol. 32 ›› Issue (7): 115-120. doi: 10.11959/j.issn.1000-0801.2016203

• Research and Development •

Hadoop bottleneck detection algorithm based on information gain

Zaole TAN1, Zhifeng HAO1, Ruichu CAI1, Xiaojun XIAO2, Yu LU2

  1 School of Computers, Guangdong University of Technology, Guangzhou 510006, China
  2 Guangzhou Useease Information Technology Co., Ltd., Guangzhou 510630, China
  • Online: 2016-07-20 Published: 2017-04-26

Abstract:

Hadoop has become a major platform for big data storage and big data mining. Although Hadoop achieves high-performance parallel computing through a distributed cluster of machines, the cluster is built from inexpensive hosts, so a bottleneck inevitably appears on some machine as the cluster load increases. To address this problem, a bottleneck detection algorithm based on information gain is proposed. The algorithm detects the cluster's bottleneck resource by computing the information gain of each resource. Experiments show that the bottleneck detection algorithm is feasible.
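
The abstract does not say how resource metrics are sampled, discretized, or labeled, so the following is only a minimal sketch of the general idea of ranking resources by information gain, not the authors' implementation. It assumes each monitoring sample has already been reduced to a discrete utilization level per resource ("high" or "low") and a performance label ("slow" or "ok"). The resource names, the sample data, and the function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of discrete labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """H(labels) - H(labels | values) for one discretized resource metric."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_resources(samples, labels):
    """Return (resource, gain) pairs sorted by descending information gain."""
    gains = {name: information_gain(vals, labels) for name, vals in samples.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, hand-made samples: one utilization level per resource per
# observation, plus a label saying whether the cluster was slow at that time.
samples = {
    "cpu":  ["high", "low", "high", "low", "high", "low", "high", "low"],
    "disk": ["high", "high", "high", "high", "low", "low", "low", "low"],
    "net":  ["low", "low", "high", "high", "low", "low", "high", "high"],
}
labels = ["slow", "slow", "slow", "slow", "ok", "ok", "ok", "ok"]

ranking = rank_resources(samples, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data the disk level separates slow from normal observations perfectly, so it has the largest gain and would be reported as the candidate bottleneck, while CPU and network levels carry no information about the label.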

Key words: big data, Hadoop, information gain, bottleneck detection
