Telecommunications Science, 2014, Vol. 30, Issue (10): 48-51. doi: 10.3969/j.issn.1000-0801.2014.10.009

• Special Topic: Big Data Technology and Applications •

NetFlow Traffic Analysis System Based on the Spark Platform

丁圣勇,闵世武,樊勇兵   

  1. Guangdong Research Institute of China Telecom Co., Ltd., Guangzhou 510630
  • Online: 2014-10-15 Published: 2017-06-29

A Large Scale NetFlow Analysis System Based on Spark

Shengyong Ding,Shiwu Min,Yongbing Fan   

  1. Guangdong Research Institute of China Telecom Co., Ltd., Guangzhou 510630, China
  • Online:2014-10-15 Published:2017-06-29

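Abstract:

Typical NetFlow analysis systems today are mostly third-party products built on proprietary architectures or platforms, and they suffer from limited scalability, insufficient openness, costly capacity expansion, and long analysis latency. The rapid development of big data technology, in particular the emergence of in-memory computing platforms such as Spark, makes centralized processing of large-scale NetFlow data feasible. This paper proposes a NetFlow analysis system based on Spark and evaluates the performance of its core algorithms (such as traffic application composition statistics) on the Spark platform. Experiments show that the Spark-based NetFlow analysis system delivers high performance and strong scalability, with a significant performance improvement over Hadoop MapReduce.

Key words: NetFlow, Spark, traffic analysis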

Abstract:

Existing NetFlow analysis systems usually adopt private distributed architectures, which face scalability, openness, cost, and latency problems. The development of big data technologies such as Spark offers a new opportunity for large-scale NetFlow processing. A new analysis system based on the Spark platform is proposed, and the effectiveness of the method is verified. The experimental results show superior performance.

Key words: NetFlow, Spark, traffic analysis
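
The abstract names traffic application composition statistics as a core algorithm but does not describe how it is computed. The following is a minimal sketch of what such a statistic might look like, written in plain Python rather than against the Spark API. The `FlowRecord` fields, the `PORT_TO_APP` table, and the function names are assumptions made for illustration and are not taken from the paper. On Spark, the same aggregation would typically be expressed as a map to (application, bytes) pairs followed by `reduceByKey`.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class FlowRecord(NamedTuple):
    """Simplified, hypothetical flow record; real NetFlow records carry more fields."""
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: int  # IP protocol number, e.g. 6 = TCP, 17 = UDP
    byte_count: int


# Hypothetical port-to-application table, for illustration only.
PORT_TO_APP = {80: "http", 443: "https", 53: "dns"}


def classify(flow: FlowRecord) -> str:
    """Map a flow to an application label by destination port."""
    return PORT_TO_APP.get(flow.dst_port, "other")


def application_composition(flows: Iterable[FlowRecord]) -> Dict[str, float]:
    """Return each application's share of the total byte count."""
    totals: Dict[str, int] = defaultdict(int)
    for flow in flows:
        # Equivalent to emitting (app, bytes) and summing per key.
        totals[classify(flow)] += flow.byte_count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {app: count / grand_total for app, count in totals.items()}


# Made-up sample flows, used only to exercise the function.
flows = [
    FlowRecord("10.0.0.1", "10.0.1.1", 80, 6, 600),
    FlowRecord("10.0.0.2", "10.0.1.1", 443, 6, 300),
    FlowRecord("10.0.0.3", "10.0.1.2", 53, 17, 50),
    FlowRecord("10.0.0.4", "10.0.1.3", 8080, 6, 50),
]
shares = application_composition(flows)
```

Because the per-application sums are associative and commutative, the aggregation can be partitioned across workers and merged afterwards, which is the property a key-based reduction on a cluster relies on.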
