Telecommunications Science, 2019, Vol. 35, Issue (10): 151-156. doi: 10.11959/j.issn.1000-0801.2019158

• Operation Technology Wide Angle •

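Scheme of enhancing real-time business processing capabilities based on Kudu for the big data platform

Feiyang GU, Ying KONG

  1. Shanghai Branch of China Telecom Group Co., Ltd., Shanghai 200042, China
  • Revised: 2019-05-26 Online: 2019-10-20 Published: 2019-11-03
  • About the authors: Feiyang GU (1978- ), male, is deputy director of the Data Center, Enterprise Informatization Department, Shanghai Branch of China Telecom Group Co., Ltd., and an engineer. His main research interests include big data technology and its applications. Ying KONG (1982- ), female, is deputy director of the Data Center, Enterprise Informatization Department, Shanghai Branch of China Telecom Group Co., Ltd., and an engineer. Her main research interests include big data platform clusters spanning remote sites and data centers, and rapid data migration.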

Abstract:

To address the weak real-time business processing capability of current Hadoop big data platforms, the use of state-of-the-art Kudu columnar storage as an effective complement to HDFS block storage is studied. The technical implementation is described: the primary key index and in-memory acceleration provided by Kudu and Spark are used to resolve business pain points that the big data platform could not previously support, such as real-time data ingestion, incremental updates, and SQL join queries. Experimental results demonstrate that the method improves the real-time business processing capability of the big data platform.

Key words: Kudu, big data, columnar storage, primary key index, memory acceleration, real-time ingestion, incremental update, SQL join query
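
The incremental-update and join capabilities named in the abstract rest on primary-key semantics: a row written with an existing key replaces the stored row instead of appending a duplicate, and lookups go through the key index rather than a full scan. The following is a minimal in-memory sketch of that behavior only. It does not use the Kudu or Spark APIs and is not the authors' implementation; `KeyedTable`, its methods, and the sample tables are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class KeyedTable:
    """Toy stand-in for a primary-key-indexed table (hypothetical, not the Kudu API)."""
    key_column: str
    rows: Dict[object, dict] = field(default_factory=dict)

    def upsert(self, row: dict) -> None:
        # Insert the row, or overwrite the stored row that has the same primary key.
        self.rows[row[self.key_column]] = dict(row)

    def get(self, key: object) -> Optional[dict]:
        # Point lookup through the key index; returns None when the key is absent.
        return self.rows.get(key)

    def join(self, other: "KeyedTable", on: str) -> List[dict]:
        # Inner join: for each local row, probe the other table's key index
        # with the value found in column `on`.
        joined = []
        for row in self.rows.values():
            match = other.get(row.get(on))
            if match is not None:
                joined.append({**match, **row})
        return joined


# Incremental update: the second write with user_id=1 replaces the first.
users = KeyedTable("user_id")
users.upsert({"user_id": 1, "plan": "basic"})
users.upsert({"user_id": 1, "plan": "premium"})

orders = KeyedTable("order_id")
orders.upsert({"order_id": 100, "user_id": 1, "amount": 30})
orders.upsert({"order_id": 101, "user_id": 2, "amount": 45})  # no matching user

# Join orders to users on the users table's primary key.
joined = orders.join(users, on="user_id")
```

In this sketch an update costs one keyed write, whereas an append-only file layout would typically require rewriting or later compacting the data to reflect the change. That contrast is the motivation for key-indexed storage, stated here as a general assumption rather than a result from the paper.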

CLC number:
