Journal on Communications, 2014, Vol. 35, Issue (Z1): 14-19. doi: 10.3969/j.issn.1000-436x.2014.z1.004

• Cyberspace Security

Network log analysis with SQL-on-Hadoop

Si-yu ZHANG¹, Kai-da JIANG¹, Jian-wen WEI¹, Xuan LUO¹, Hai-yang WANG²

  1. Network and Information Center, Shanghai Jiaotong University, Shanghai 200240, China
  2. School of Electronic Information and Electrical Engineering, Shanghai Jiaotong University, Shanghai 200240, China
  • Online: 2014-10-25 Published: 2017-06-19
  • Supported by:
    The National Natural Science Foundation of China

Abstract:

With the rapid expansion of network bandwidth, devices, and applications, log management faces the challenge of exploding data volumes. A log analysis platform built on SQL-on-Hadoop can store and query hundreds of billions of log entries effectively. Columnar and compressed data formats for Hadoop are benchmarked on a real-world multi-TB dataset, and the efficiency of conditional and statistical queries in Hive and Impala is tested. With the gzipped Parquet format, log data can be compressed by 80%, and querying with Impala is 5 times faster. Six security incident analysis and detection applications are already deployed on this platform.
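
To make the two query classes named in the abstract concrete, the following is a minimal sketch of a conditional query (filtering entries by a predicate) and a statistical query (aggregating over groups). It is not the authors' code: the table name `net_log`, its columns, and the sample rows are hypothetical, and Python's built-in `sqlite3` module stands in for Hive or Impala only so that the example runs without a cluster.

```python
import sqlite3

# Hypothetical log rows, invented for illustration only:
# (timestamp, source IP, destination IP, destination port, bytes transferred)
ROWS = [
    ("2014-05-01 10:00:00", "10.0.0.1", "203.0.113.5", 443, 1200),
    ("2014-05-01 10:00:01", "10.0.0.2", "203.0.113.5", 80, 300),
    ("2014-05-01 10:00:02", "10.0.0.1", "198.51.100.7", 22, 4500),
    ("2014-05-01 10:00:03", "10.0.0.3", "203.0.113.5", 443, 800),
]


def build_db():
    """Create an in-memory table with an assumed log schema and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE net_log ("
        "ts TEXT, src_ip TEXT, dst_ip TEXT, dst_port INTEGER, n_bytes INTEGER)"
    )
    conn.executemany("INSERT INTO net_log VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def conditional_query(conn, src_ip):
    """Conditional query: return the entries whose source IP matches."""
    cur = conn.execute(
        "SELECT ts, dst_ip, dst_port FROM net_log WHERE src_ip = ? ORDER BY ts",
        (src_ip,),
    )
    return cur.fetchall()


def statistical_query(conn):
    """Statistical query: count entries and sum bytes per destination IP."""
    cur = conn.execute(
        "SELECT dst_ip, COUNT(*) AS hits, SUM(n_bytes) AS total_bytes "
        "FROM net_log GROUP BY dst_ip ORDER BY hits DESC, dst_ip"
    )
    return cur.fetchall()


if __name__ == "__main__":
    db = build_db()
    print(conditional_query(db, "10.0.0.1"))
    print(statistical_query(db))
```

The conditional query reads only three of the five columns, and the statistical query only two. Touching a subset of columns is the access pattern that columnar formats such as Parquet are designed to serve.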

Key words: log analysis, big data, Hadoop, SQL, network security
