Chinese Journal of Network and Information Security, 2020, Vol. 6, Issue (6): 1-12. doi: 10.11959/j.issn.2096-109x.2020072
• Special Column: Network Application and Protection Technology •
Yingjun ZHANG1, Ushangqi LI2, Mu YANG2, Haixia ZHANG1, Kezhen HUANG1
Revised: 2020-09-24
Online: 2020-12-15
Published: 2020-12-16
Yingjun ZHANG, Ushangqi LI, Mu YANG, Haixia ZHANG, Kezhen HUANG. Survey on anomaly detection technology based on logs[J]. Chinese Journal of Network and Information Security, 2020, 6(6): 1-12.

Eid | Name | Message | Frequency | Parameters |
1 | server.ZooKeeperServer | Server environment | 2 | host.name=localhost,user.home=/home/hadoop |
2 | server.NIOServerCnxnFactory | binding to port | 1 | 0.0.0.0/0.0.0.0:2181 |
3 | server.NIOServerCnxnFactory | Accepted socket connection | 1 | 192.168.31.154:38221 |
4 | server.ZooKeeperServer | Client attempting to establish new session | 1 | 192.168.31.154:38221 |
5 | server.ZooKeeperServer | Established session | 1 | 0x1621970549a0000 |
… | … | … | … |

Name | Category | Algorithm used | Mode | Accuracy | Time complexity | Efficiency |
CFG[ | Code analysis | AST,CFG | Offline | ++ | O(n^3) | +++ |
PCA[ | Code analysis | AST | Online | +++ | O(n) | +++ |
CLSTR[ | Machine learning | IPLoM | Offline | ++ | O(n) | +++ |
LKE[ | Machine learning | Clustering | Offline | ++ | O(n^2) | + |
Logram[ | Natural language processing | n-gram | Online | +++ | O(n) | +++ |
NLog[ | Natural language processing | POS | Offline | ++ | O(n) | +++ |
Spell[ | Classical algorithm | LCS | Online | +++ | O(n) | +++ |
Drain[ | Classical algorithm | Parse tree | Online | +++ | O(n) | +++ |

Name | Category | Algorithm used | Mode | Accuracy | Recall |
AClog[ | Supervised learning | SVM,LCS | Offline | 92.4% | 80% |
IM[ | Supervised learning | K-prototype,kNN | Offline | 89% | 85% |
LogClass[ | Supervised learning | PU,SVM | Online | 99.048% | 99.988% |
LogCluster[ | Unsupervised learning | Clustering algorithm | Online | 60% | 36.2% |
MCL[ | Unsupervised learning | PCA | Online | 99.8% | / |
LACT[ | Unsupervised learning | TCA,NLP | Offline | 97.08% | 95.45% |
CausalConvLSTM[ | Deep learning | CNN,LSTM | Offline | 89.59% | 99.72% |
DeepLog[ | Deep learning | LSTM | Online | 95% | 96% |
LogGAN[ | Deep learning | LSTM | Offline | 100% | 35.6% |
LogRobust[ | Deep learning | Bi-LSTM | Online | 98% | 100% |

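
As a rough illustration of how raw log lines could be reduced to structured events like those in the first table above (Eid, Name, Message, Frequency, Parameters), the following Python sketch masks variable tokens with a regular expression and groups lines by the remaining template. The input lines, the PARAM pattern, the "<*>" placeholder, and the parse function are all assumptions made for this example; it is a plain regex heuristic and not any of the parsers compared in the tables.

```python
import re

# Hypothetical raw lines in "<component>: <message>" form, loosely modelled
# on the values shown in the event table; they are not taken from a real log.
RAW_LINES = [
    "server.NIOServerCnxnFactory: binding to port 0.0.0.0/0.0.0.0:2181",
    "server.NIOServerCnxnFactory: Accepted socket connection 192.168.31.154:38221",
    "server.ZooKeeperServer: Established session 0x1621970549a0000",
    "server.ZooKeeperServer: Established session 0x1621970549a0001",
]

# Assumed pattern for variable tokens: hex identifiers, or IPv4-style
# addresses with an optional "/address" part and an optional ":port" suffix.
PARAM = re.compile(
    r"0x[0-9a-fA-F]+|\d+(?:\.\d+){3}(?:/\d+(?:\.\d+){3})?(?::\d+)?"
)


def parse(lines):
    """Group lines by (component, masked message) and collect parameters."""
    events = {}
    for line in lines:
        name, message = line.split(": ", 1)
        # Replace every variable token with a wildcard to obtain the template.
        template = PARAM.sub("<*>", message)
        event = events.setdefault(
            (name, template),
            {"eid": len(events) + 1, "frequency": 0, "parameters": []},
        )
        event["frequency"] += 1
        event["parameters"].extend(PARAM.findall(message))
    return events


if __name__ == "__main__":
    for (name, template), event in parse(RAW_LINES).items():
        print(event["eid"], name, template, event["frequency"],
              ",".join(event["parameters"]), sep=" | ")
```

Lines that differ only in their masked tokens share one event, so the frequency column counts occurrences and the parameters column accumulates the values that were masked out.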
[61] | RISTO V , BERNHARDS B , MARKUS K . An unsupervised framework for detecting anomalous messages from syslog log files[C]// Network Operations and Management Symposium. 2018: 1-6. |
[62] | LOU J G , FU Q , YANG S Q ,et al. Mining invariants from console logs for system problem detection[C]// USENIX Annual Technical Conference. 2010: 1-14. |
[63] | YUAN Y , ANU H , SHI W C ,et al. Learning-based anomaly cause tracing with synthetic analysis of logs from multiple cloud service components[C]// Computer Software and Applications Conference. 2019: 66-71. |
[64] | BIPLOB D , MOHIUDDIN S , MUHAMMADALI G ,et al. LogLens:a real-time log analysis system[C]// International Conference on Distributed Computing Systems. 2018: 1052-1062. |
[65] | DUNIA R , QIN J S . Multi-dimensional fault diagnosis using a subspace approach[C]// ACC. 1997: 1-5. |
[66] | PAPINENI K . Why inverse document frequency?[C]// NAACL ’01. 2001: 1-8. |
[1] | LIAO X K , LI S S . Survey on log research of large scale software system[J]. Journal of Software, 2016,27(8): 1934-1947. |
[67] | ASTEKIN M , OZCAN S , SOZER H . Incremental analysis of large-scale system logs for anomaly detection[C]// International Conference on Big Data. 2019: 2119-2127. |
[68] | ASTEKIN M , ZENGIN H , SÖZER H . Evaluation of distributed machine learning algorithms for anomaly detection from large-scale system logs:a case study[C]// 2018 IEEE International Conference on Big Data (Big Data). 2018: 2071-2077. |
[2] | OLINER A J , GANAPATHI A , XU W . Advances and challenges in log analysis[J]. Communications of the ACM, 2012,55(2): 55-61. |
[3] | RAPIDS. cyBERT:neural network,that’s the tech; to free your staff from,bad regex[EB]. |
[69] | BROWN A , TUOR A , HUTCHINSON B ,et al. Recurrent neural network attention mechanisms for interpretable system log anomaly detection[C]// MLCS 2018. 2018: 1-8. |
[70] | BERTERO C , ROY M , SAUVANAUD C ,et al. Experience report:log mining using natural language processing and application to anomaly detection[C]// International Symposium on Software Reliability Engineering(2017). 2017: 351-360. |
[4] | MI H , WANG H , ZHOU Y ,et al. Toward fine-grained,unsupervised,scalable performance diagnosis for production cloud computing systems[J]. IEEE Trans Parallel Distrib Syst, 2013,24(6): 1245-1255. |
[71] | AMEY W , TANISHQ G , ROHIT V ,et al. Hybrid CAE-VAE for unsupervised anomaly detection in log file systems[C]// International Conference on Computing Communication and Networking Technologies. 2019: 1-7. |
[72] | YOON-HO C , PENG L , SHANG Z T ,et al. Using deep learning to solve computer security challenges:a survey[J]. arXiv:Cryptography and Security. 2019. |
[5] | CUI Y , ZHANG Z . Research on Template Extraction Based on Large-scale Network Log[J]. Computer Science, 2017,44(11A): 448-452. |
[6] | HE P , ZHU J , HE S ,et al. An evaluation study on log parsing and its use in log mining[C]// Proc of the 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks. 2016: 654-661. |
[73] | STEVEN Y , MELODY M , TENG-SHENG M . CausalConvLSTM:semi-supervised log anomaly detection through sequence modeling[C]// International Conference on Machine Learning and Applications. 2019: 1334-1341. |
[74] | DU M , LI F F , VIVEK S . DeepLog:anomaly detection and diagnosis from system logs through deep learning[C]// CCS. 2017: 1285-1298. |
[7] | HE S L , ZHU J M , HE P J ,et al. Experience report:system log analysis for anomaly detection[C]// 27th International Symposium on Software Reliability Engineering. 2016: 207-218. |
[8] | FU Q , LOU J ,et al. Contextual analysis of program logs for understanding system behaviors[C]// MSR ’13. 2013: 397-400. |
[9] | PECCHIA A , COTRONEO D , KALBARCZYK Z ,et al. Improving log-based field failure data analysis of multi-node computing systems[C]// Dependable Systems and Networks. 2011: 97-108. |
[75] | XIA B , YIN J J , XU J ,et al. LogGAN:a sequence-based generative adversarial network for anomaly detection based on system logs[C]// SciSec 2019:Science of Cyber Security,Switzerland. 2019: 61-76. |
[76] | ZHANG X , XU Y , ZHANG H Y ,et al. Robust log-based anomaly detection on unstable log data[C]// ESEC/FSE’19. 2019: 807-817. |
[10] | LU J , LI F , LI L ,et al. CloudRaid:hunting concurrency bugs in the cloud via log-mining[C]// Foundations of Software Engineering. 2018: 3-14. |
[11] | AIT EL HADJ M , KHOUMSI A , BENKAOUZ Y ,et al. Efficient security policy management using suspicious rules through access log analysis[J]. Lecture Notes in Computer Science, 2019,11704: 250-266. |
[12] | STUDIAWAN H , FERDOUS S , PAYNE C . A survey on forensic investigation of operating system logs[J]. Digital Investigation, 2019,29: 1-20. |
[77] | WANG X , WANG D , ZHANG Y ,et al. Unsupervised learning for log data analysis based on behavior and attribute features[C]// International Conference on Artificial Intelligence. 2020: 510-518. |
[78] | MEI Y D , CHEN X , SUN Y Z ,et al. A method for software system anomaly detection based on log information and CNN-Text[J]. Chinese Journal of Computers, 2020,43(2): 366-380. |
[79] | LU S Y , WEI X , LI Y D ,et al. Detecting anomaly in big data system logs using convolutional neural network[C]// Dependable Autonomic and Secure Computing. 2018: 151-158. |
[13] | CHOW M , MEISNER D , FLINN J ,et al. The mystery machine:end-to-end performance analysis of large-scale internet services[C]// 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’14). 2014: 217-231. |
[14] | KARTHIK N , CHARLES K , JENNIFER N . Structured comparative analysis of systems logs to diagnose performance problems[C]// Networked Systems Design and Implementation. 2012: 26-26. |
[15] | CHANDOLA V , BANERJEE A , KUMAR V . Anomaly detection:a survey[J]. ACM Computing Surveys, 2009,41(3): 1-58. |
[16] | KULKARNI J , JOSHI S , BAPAT S ,et al. Analysis of system logs for pattern detection and anomaly prediction[C]// Proceeding of International Conference on Computational Science and Applications. 2020: 427-436. |
[17] | ARIEL R , RANDY K . Chukwa:a system for reliable large-scale log collection[C]// Usenix Large Installation Systems Administration Conference. 2010: 1-15. |
[18] | ZHU J M , HE S L , LIU J Y ,et al. Tools and benchmarks for automated log parsing[C]// International Conference on Software Engineering. 2019: 121-130. |
[19] | Splunk[EB]. |
[20] | Logentries[EB]. |
[21] | Logz.io[EB]. |
[22] | DU M , LI F F . Spell:streaming parsing of system event logs[C]// ICDM 2016. 2016: 859-864. |
[23] | XU W , HUANG L , FOX A ,et al. Detecting large-scale system problems by mining console logs[C]// Symposium on Operating Systems Principles. 2009: 117-132. |
[24] | NAGAPPAN M , WU K , MLADEN A . Efficiently extracting operational profiles from execution logs using suffix arrays[C]// ISSRE. 2009: 41-50. |
[25] | BAO L , LI Q , LU P Y ,et al. Execution anomaly detection in large-scale systems through console log analysis[J]. Journal of Systems and Software, 2018,143: 172-186. |
[26] | LONVICK C . The BSD syslog protocol[EB]. |
[27] | VAARANDI R . Mining event logs with SLCT and LogHound[C]// Proceedings of the 2008 IEEE/IFIP Network Operations and Management Symposium. 2008: 1071-1074. |
[28] | RISTO V , PIHELGAS M . LogCluster — a data clustering and pattern mining algorithm for event logs[C]// Conference on Network and Service Management (CNSM). 2015: 1-7. |
[29] | MAKANJU A , ZINCIR-HEYWOOD N , MILIOS E E . A lightweight algorithm for message type extraction in system application logs[J]. IEEE Transactions on Knowledge and Data Engineering, 2012,24(11): 1921-1936. |
[30] | TATSUAKI K , KEISUKE I , TATSUYA M ,et al. Spatio-temporal factorization of log data for understanding network events[C]// IEEE INFOCOM 2014. 2014: 610-618. |
[31] | FU Q , LOU J G , WANG Y ,et al. Execution anomaly detection in distributed systems through unstructured log analysis[C]// (ICDM’09)Proc of International Conference on Data Mining. 2009: 149-158. |
[32] | TANG L , LI T , PERNG C S . LogSig:generating system events from raw textual logs[C]// CIKM’11:Proc of ACM International Conference on Information and Knowledge Management. 2011: 785-794. |
[33] | HE P J , ZHU J M , HE S L ,et al. Towards automated log parsing for large-scale log data analysis[J]. IEEE Transactions on Dependable and Secure Computing, 2018,15(6): 931-944. |
[34] | STUDIAWAN H , SOHEL F , PAYNE C . Automatic event log abstraction to support forensic investigation[C]// ACSW 2020. 2020: 1-9. |
[35] | STUDIAWAN H , PAYNE C , SOHEL F . Automatic graph-based clustering for security logs[C]// Advanced Information Networking and Applications(AINA). 2019: 914-926. |
[36] | DAI H , LI H , CHEN C S ,et al. Logram:efficient log parsing using n-gram dictionaries[R]. 2020. |
[37] | NICOLAS A , YOHAN P , SOPHIE C ,et al. Improving performances of log mining for anomaly prediction through NLP-based log parsing[C]// Modeling Analysis And Simulation on Computer and Telecommunication Systems. 2018: 237-243. |
[38] | LI G F , ZHU P J , CAO N ,et al. Improving the system log analysis with language model and semi-supervised classifier[J]. Multimedia Tools and Applications, 2019,78(15): 21521-21535. |
[39] | PI A D , CHEN W , ZELLER W ,et al. It can understand the logs,literally[C]// International Parallel and Distributed Processing Symposium. 2019: 446-451. |
[40] | LIU W Y , LIU X , DI X Q ,et al. FastlogSim:a quick log pattern parser scheme based on text similarity[C]// Knowledge Science Engineering and Management. 2020: 211-219. |
[41] | DU M , LI F F . Spell:online streaming parsing of large unstructured system logs[J]. IEEE Transactions on Knowledge and Data Engineering, 2019,31(11): 2213-2227. |
[42] | MESSAOUDI S , PANICHELLA A , BIANCULLI D ,et al. A search-based approach for accurate identification of log message formats[C]// ICPC. 2018: 167-177. |
[43] | HE P J , ZHU J M , ZHENG Z B ,et al. Drain:an online log parsing approach with fixed depth tree[C]// ICWS. 2017: 33-40. |
[44] | BAO L F , BUSANY N , LO D ,et al. Statistical log differencing[C]// Automated Software Engineering. 2019: 851-862. |
[45] | SIDDHARTHA S , SUPRATIM D , SRIKANT R ,et al. Learning latent events from network message logs[J]. IEEE ACM Transactions on Networking, 2019,27(4): 1728-1741. |
[46] | XIE X S , WANG Z , XIAO X H ,et al. A confidence-guided evaluation for log parsers inner quality[J]. Mobile Networks and Applications, 2020: 1-12. |
[47] | ZHANG D X , ZHENG Y , WEN Y ,et al. Role-based log analysis applying deep learning for insider threat detection[C]// SecArch'18. 2018: 18-20. |
[48] | EL-MASRI D , PETRILLO F , YANN-GAËL G ,et al. A systematic literature review on automated log abstraction techniques[J]. Information & Software Technology, 2020,22: 1-18. |
[49] | OPREA A , LI Z , YEN T F ,et al. Detection of early-stage enterprise infection by mining large-scale log data[C]// Proceedings of the 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). 2015: 45-56. |
[50] | MARCELLO C , DOMENICO C , ANTONIO P . Event logs for the analysis of software failures:a rule-based approach[J]. IEEE Transactions on Software Engineering, 2013,39(6): 806-821. |
[51] | LOU J G , FU Q , YANG S G ,et al. Mining program workflow from interleaved traces[C]// Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2010: 613-622. |
[52] | BEZERRA F , WAINER J . Algorithms for anomaly detection of traces in logs of process aware information systems[J]. Information Systems, 2013,38(1): 33-44. |
[53] | JIA T , CHEN G , YANG L ,et al. An approach for anomaly diagnosis based on hybrid graph model with logs for distributed services[C]// Proceedings of the IEEE International Conference on Web Services (ICWS). 2017: 25-32. |
[54] | LI T , MA J F , PEI Q Q ,et al. AClog:attack chain construction based on log correlation[C]// Global Communications Conference. 2019: 1-6. |
[55] | MENG W B , LIU Y , ZHANG S L ,et al. Device-agnostic log anomaly classification with partial labels[C]// International Workshop on Quality of Service. 2018: 1-6. |
[56] | LIU Z L , QIN T , GUAN X H ,et al. An integrated method for anomaly detection from massive system logs[J]. IEEE Access, 2018: 30602-30611. |
[57] | XU W , HUANG L , ATREJA S ,et al. Online system problem detection by mining patterns of console logs[C]// ICDM’09. 2009: 588-597. |
[58] | NANDI A , MANDAL A , ATREJA S ,et al. Anomaly detection using program control flow graph mining from execution logs[C]// KDD 2016. 2016: 215-224. |
[59] | LIN Q W , ZHANG H Y , LOU J G ,et al. Log clustering based problem identification for online service systems[C]// ICSE 2016. 2016: 1-10. |
[60] | LIU F C , WEN Y , ZHANG D X ,et al. Log2vec:a heterogeneous graph embedding based approach for detecting cyber threats within enterprise[C]// CCS’19. 2019: 1777-1794. |