Big Data Research ›› 2017, Vol. 3 ›› Issue (1): 80-89. doi: 10.11959/j.issn.2096-0271.2017010

• Application •

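Design scheme of massive traffic data real-time access based on HBase and ElasticSearch

Changqing DONG, Nver REN, Qingyu ZHANG, Yujing TIAN

  1. Software Business Department, Beijing CATARC Data & Technology Center, Tianjin 300300, China
  • Online: 2017-01-20 Published: 2017-03-17
  • About the authors:
    Changqing DONG (1980-), male, is a senior engineer in the Software Business Department of Beijing CATARC Data & Technology Center. His main research interests are big data and the Internet of Vehicles.
    Nver REN (1990-), female, is an assistant engineer in the Software Business Department of Beijing CATARC Data & Technology Center. Her main research interests are big data and cloud computing.
    Qingyu ZHANG (1991-), male, is an assistant engineer in the Software Business Department of Beijing CATARC Data & Technology Center. His main research interests are software architecture and cloud computing.
    Yujing TIAN (1987-), female, is an intermediate engineer in the Software Business Department of Beijing CATARC Data & Technology Center. Her main research interests are software architecture and programming patterns.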

Abstract:

Traffic flow data is massive in volume and is stored and exchanged at high rates, so its acquisition, storage, and retrieval have become key problems for vehicle remote monitoring platforms. To address them, LVS cluster technology was used to balance the data acquisition load, a queue cache was used to absorb I/O latency, and HBase was used for distributed data storage. Because Hadoop is weak at real-time online data processing, ElasticSearch was integrated with HBase and a hierarchical index was built. With these key technologies designed and implemented, the number of monitored vehicles grew from 400 to more than 10,000, and online queries over PB-scale data became about 10 to 20 times faster, which verified the efficiency of the scheme.

Key words: Hadoop/HBase, ElasticSearch, Linux virtual server, massive data, real-time
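
The write and query paths summarized in the abstract can be illustrated with a minimal sketch. It assumes a common pattern for pairing a row store with a search index: records are buffered in a queue, flushed in batches to the store, and indexed by selected fields so that a query first resolves row keys and then fetches the full rows. Everything here is hypothetical: `RowStore`, `FieldIndex`, `BufferedWriter`, the row key layout, and the field names are in-memory stand-ins, not the HBase or ElasticSearch client APIs, and the sketch does not attempt to reproduce the hierarchical index that the paper builds.

```python
from collections import defaultdict, deque


class RowStore:
    """In-memory stand-in for a row-keyed table (the role HBase plays)."""

    def __init__(self):
        self.rows = {}

    def put_batch(self, records):
        # One call per batch, mimicking a bulk write instead of per-row I/O.
        for key, record in records:
            self.rows[key] = record

    def get(self, key):
        return self.rows[key]


class FieldIndex:
    """In-memory stand-in for a search index: (field, value) -> row keys."""

    def __init__(self):
        self.postings = defaultdict(list)

    def add(self, key, record, fields):
        for field in fields:
            self.postings[(field, record[field])].append(key)

    def search(self, field, value):
        # .get avoids creating empty posting lists for unknown values.
        return list(self.postings.get((field, value), []))


class BufferedWriter:
    """Queue that absorbs incoming records and flushes them in batches."""

    def __init__(self, store, index, indexed_fields, batch_size=3):
        self.store = store
        self.index = index
        self.indexed_fields = indexed_fields
        self.batch_size = batch_size
        self.queue = deque()

    def submit(self, record):
        # Assumed row key layout: vehicle id plus zero-padded timestamp.
        key = f"{record['vehicle_id']}#{record['ts']:010d}"
        self.queue.append((key, record))
        if len(self.queue) >= self.batch_size:
            self.flush()

    def flush(self):
        batch = list(self.queue)
        self.queue.clear()
        self.store.put_batch(batch)
        for key, record in batch:
            self.index.add(key, record, self.indexed_fields)


def query(store, index, field, value):
    """Resolve row keys through the index, then fetch full rows from the store."""
    return [store.get(key) for key in index.search(field, value)]


store, index = RowStore(), FieldIndex()
writer = BufferedWriter(store, index, ["vehicle_id", "status"], batch_size=2)
writer.submit({"vehicle_id": "V001", "ts": 1, "speed": 42, "status": "ok"})
writer.submit({"vehicle_id": "V002", "ts": 1, "speed": 0, "status": "fault"})
writer.submit({"vehicle_id": "V001", "ts": 2, "speed": 0, "status": "fault"})
writer.flush()  # write out the record still waiting in the queue
faults = query(store, index, "status", "fault")
```

In this sketch the index holds only row keys, so the full records live in one place and the query cost is an index lookup plus direct reads by key. A real deployment would also need to handle failures between the store write and the index update, which the sketch ignores.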

