Journal on Communications ›› 2016, Vol. 37 ›› Issue (4): 12-22. doi: 10.11959/j.issn.1000-436x.2016068

• Academic paper

Compression strategies selection method based on classification of HBase data

Hai-yan WANG1,2, Cai-hang FU1,2

  1. School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  2. Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  • Online: 2016-04-25 Published: 2016-04-26
  • Supported by:
    The National Natural Science Foundation of China

Abstract:

Most existing methods for selecting a compression strategy for HBase data do not consider whether the data is cold or hot, and the selection process itself is one-sided and unreliable. To address these problems, a compression strategy selection method based on the classification of HBase data is proposed. HBase data is classified as cold or hot according to the access frequency of each data file, and each file is assigned a specific access level. On this basis, an evaluation layer is added and, by combining the neighbor-sector-based and statistic-column-based selection methods, a compression strategy selection method based on data access level is proposed. Simulation results show that the proposed method not only saves storage space but also greatly improves the query performance of HBase data.

Key words: data compression, HBase, compression strategies selection method, cold and hot data
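
The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold values, the three level names, and the mapping from level to codec are all hypothetical, chosen only to show the idea of deriving an access level from access frequency and then picking a codec. The codec names (SNAPPY, LZ4, GZ) are ones HBase supports. The evaluation layer and the neighbor-sector and statistic-column methods are not modeled here.

```python
from dataclasses import dataclass

# Hypothetical thresholds in accesses per day, highest first (not from the paper).
LEVEL_THRESHOLDS = [(100.0, "hot"), (10.0, "warm")]

# Hypothetical mapping: hotter data gets a faster codec, colder data a
# codec with a higher compression ratio.
CODEC_BY_LEVEL = {"hot": "SNAPPY", "warm": "LZ4", "cold": "GZ"}


@dataclass
class DataFile:
    name: str
    accesses: int        # number of accesses observed in the window
    window_days: float   # length of the observation window in days


def access_level(f: DataFile) -> str:
    """Return the access level of a file from its access frequency."""
    if f.window_days <= 0:
        raise ValueError("window_days must be positive")
    freq = f.accesses / f.window_days
    for threshold, level in LEVEL_THRESHOLDS:
        if freq >= threshold:
            return level
    return "cold"


def select_codec(f: DataFile) -> str:
    """Pick a compression codec name for a file based on its access level."""
    return CODEC_BY_LEVEL[access_level(f)]


if __name__ == "__main__":
    for f in (DataFile("a", 1000, 1.0), DataFile("b", 50, 1.0), DataFile("c", 5, 1.0)):
        print(f.name, access_level(f), select_codec(f))
```

In this sketch a file accessed 1000 times in one day is treated as hot and given a fast codec, while a file accessed 5 times is treated as cold and given a codec that favors compression ratio over speed.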
