Journal on Communications ›› 2022, Vol. 43 ›› Issue (6): 58-70.doi: 10.11959/j.issn.1000-436x.2022116
Xiuzhang YANG1,2, Guojun PENG1,2, Zichuan LI1,2, Yangqi LYU1,2, Side LIU1,2, Chenguang LI1,2
Revised: 2022-05-18
Online: 2022-06-01
Published: 2022-06-01
Xiuzhang YANG, Guojun PENG, Zichuan LI, Yangqi LYU, Side LIU, Chenguang LI. Research on entity recognition and alignment of APT attack based on Bert and BiLSTM-CRF[J]. Journal on Communications, 2022, 43(6): 58-70.
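
Tag | Entity category | Definition | Examples |
AG | APT group | Names of common APT attack groups | Lazarus, APT28, OceanLotus |
AEQ | Attack equipment | Tools used by APT groups | CobaltStrike, Metasploit, Gh0st |
AM | Attack method | Attack means and techniques used by APT groups | SQL injection, spear phishing, XSS attack |
AV | Attack vulnerability | Vulnerabilities commonly exploited by APT groups, identified mainly by CVE ID or by a specific vulnerability name | CVE-2017-11882, CVE-2018-4878, EternalBlue, HeartBleed |
AE | Attack event | Attack campaigns carried out by APT groups in recent years | Operation Blockbuster, DarkSeoul |
AT | Attack target | Companies, departments, and organizations attacked by APT groups | Sony, Iranian nuclear power plant |
AI | Attacked industry | Industries attacked by APT groups | financial, economic, trade policy |
MF | Malicious file | Malicious files, sensitive directories, and malicious commands commonly used by APT groups; file formats include exe, xls, and doc | cmdl32.exe, Agent.btz, wwlib.dll |
MFA | Malware family | Malware families commonly used by APT groups | Trojan/Win32.Occamy, ZeuS |
RL | Region | Regions where APT groups are based and regions they target | North Korea, Russia, South Asia |
OS | Operating system | Operating system environment in which APT attacks are launched | Windows, Mac, Linux, Android |
SI | Exploited software | Software environment in which APT attacks are launched | Chrome, Office, Firefox |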
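
To make the tag codes in the table above concrete, the following minimal sketch expands them into a per-token label set and labels one sentence. The BIO (begin/inside/outside) prefix scheme is an assumption, since this page does not state which annotation scheme the authors used; the helper names and the sample sentence are hypothetical, and the sentence is built only from example entities listed in the table.

```python
# Tag codes copied from the table above. The BIO prefixes are an assumption,
# not something this page states about the authors' annotation scheme.
ENTITY_TAGS = ["AG", "AEQ", "AM", "AV", "AE", "AT", "AI", "MF", "MFA", "RL", "OS", "SI"]


def build_label_set(tags):
    """Return 'O' plus a B- and an I- label for every entity tag."""
    labels = ["O"]
    for tag in tags:
        labels.extend([f"B-{tag}", f"I-{tag}"])
    return labels


def spans_to_bio(tokens, spans):
    """Convert (start, end_exclusive, tag) token spans into per-token BIO labels."""
    labels = ["O"] * len(tokens)
    for start, end, tag in spans:
        labels[start] = f"B-{tag}"
        for i in range(start + 1, end):
            labels[i] = f"I-{tag}"
    return labels


LABELS = build_label_set(ENTITY_TAGS)

# Hypothetical sentence assembled from example entities in the table.
tokens = ["Lazarus", "used", "spear", "phishing", "against", "Sony"]
spans = [(0, 1, "AG"), (2, 4, "AM"), (5, 6, "AT")]
bio = spans_to_bio(tokens, spans)

for token, label in zip(tokens, bio):
    print(f"{token}\t{label}")
```

Under this assumed scheme, 12 entity categories give 25 labels in total, which would be the output dimension of a CRF layer on top of the encoder.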

Model | Precision | Recall | F1-score | Accuracy |
CRF | 0.8002 | 0.7190 | 0.7574 | 0.7413 |
LSTM-CRF | 0.8777 | 0.7628 | 0.8163 | 0.8169 |
GRU-CRF | 0.8896 | 0.7489 | 0.8132 | 0.8052 |
BiLSTM-CRF | 0.9514 | 0.7643 | 0.8476 | 0.8466 |
CNN-CRF | 0.8536 | 0.8104 | 0.8314 | 0.8264 |
Bert-CRF | 0.9236 | 0.7542 | 0.8303 | 0.8145 |
Proposed model | 0.9296 | 0.8733 | 0.9006 | 0.9004 |
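
The F1-score column in the table above can be cross-checked against the precision and recall columns, since F1 is the harmonic mean of the two. The sketch below recomputes it for every row; the values are copied from the table, the last row is the paper's own model, and the tolerance of 5e-4 is an assumption that allows for the inputs having been rounded to four decimals.

```python
# (precision, recall, reported F1) copied from the comparison table above.
RESULTS = {
    "CRF": (0.8002, 0.7190, 0.7574),
    "LSTM-CRF": (0.8777, 0.7628, 0.8163),
    "GRU-CRF": (0.8896, 0.7489, 0.8132),
    "BiLSTM-CRF": (0.9514, 0.7643, 0.8476),
    "CNN-CRF": (0.8536, 0.8104, 0.8314),
    "Bert-CRF": (0.9236, 0.7542, 0.8303),
    "Proposed model": (0.9296, 0.8733, 0.9006),
}

TOLERANCE = 5e-4  # assumed slack for four-decimal rounding of the inputs


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Absolute gap between the recomputed and the reported F1 for each model.
deviations = {
    name: abs(f1_score(p, r) - reported)
    for name, (p, r, reported) in RESULTS.items()
}

for name, (p, r, reported) in RESULTS.items():
    status = "ok" if deviations[name] < TOLERANCE else "check"
    print(f"{name:15s} computed={f1_score(p, r):.4f} reported={reported:.4f} {status}")
```

Every row agrees with its reported F1 within the tolerance, so the table is internally consistent.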

Entity category | Precision | Recall | F1-score |
APT group | 0.9370 | 0.8773 | 0.9062 |
Attack equipment | 0.9517 | 0.8922 | 0.9210 |
Attack method | 0.9473 | 0.9085 | 0.9275 |
Attack vulnerability | 0.9074 | 0.8596 | 0.8829 |
Attack event | 0.9304 | 0.8560 | 0.8917 |
Attack target | 0.9318 | 0.8723 | 0.9011 |
Attacked industry | 0.9252 | 0.9004 | 0.9126 |
Malicious file | 0.9102 | 0.8329 | 0.8698 |
Malware family | 0.8909 | 0.8305 | 0.8596 |
Region | 0.9475 | 0.8741 | 0.9093 |
Operating system | 0.9421 | 0.8906 | 0.9157 |
Exploited software | 0.9338 | 0.8853 | 0.9089 |
Average | 0.9296 | 0.8733 | 0.9006 |
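
The final row of the table above can be reproduced from the twelve per-category rows. The sketch below takes the unweighted (macro) mean of each column and also computes the harmonic mean of the averaged precision and recall. Treating the average as unweighted is an assumption, because this page does not say how the row was computed; the figures are copied from the table.

```python
from statistics import mean

# (precision, recall, F1) per entity category, copied from the table above,
# in table order from APT group down to exploited software.
PER_CATEGORY = [
    (0.9370, 0.8773, 0.9062),
    (0.9517, 0.8922, 0.9210),
    (0.9473, 0.9085, 0.9275),
    (0.9074, 0.8596, 0.8829),
    (0.9304, 0.8560, 0.8917),
    (0.9318, 0.8723, 0.9011),
    (0.9252, 0.9004, 0.9126),
    (0.9102, 0.8329, 0.8698),
    (0.8909, 0.8305, 0.8596),
    (0.9475, 0.8741, 0.9093),
    (0.9421, 0.8906, 0.9157),
    (0.9338, 0.8853, 0.9089),
]

# Unweighted (macro) mean of each column; the weighting is an assumption.
mean_precision = mean(p for p, _, _ in PER_CATEGORY)
mean_recall = mean(r for _, r, _ in PER_CATEGORY)
mean_f1 = mean(f for _, _, f in PER_CATEGORY)

# Alternative: F1 taken as the harmonic mean of the averaged precision and recall.
f1_of_means = 2 * mean_precision * mean_recall / (mean_precision + mean_recall)

print(f"mean precision   = {mean_precision:.4f}")
print(f"mean recall      = {mean_recall:.4f}")
print(f"mean of F1       = {mean_f1:.4f}")
print(f"F1 of mean P, R  = {f1_of_means:.4f}")
```

The macro means of precision and recall round to the reported 0.9296 and 0.8733. The mean of the F1 column rounds to 0.9005, while the harmonic mean of the averaged precision and recall rounds to the reported 0.9006, so the published average F1 appears to have been derived the second way. This is an observation from the numbers, not something the page states.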

Entity category | Successfully recognized named entities |
APT group | APT29; APT32; APT28; Turla; Sandworm; MuddyWater; OilRig; APT39; Kimsuky; FIN7; TA505 |
Attack equipment | PowerShell; Cobalt Strike; Mimikatz; LaZagne; Cannon; Dropper; Empire; NBTscan; TrickBot; FireMalv |
Attack method | Spearphishing; C2; Anti-censorship; Backdoor; Payload; Persistence; SQL injection; Watering Hole Attack |
Attack vulnerability | CVE-2017-11882; CVE-2017-0199; CVE-2012-0158; CVE-2019-19781; CVE-2014-4114; CVE-2018-0802 |
Attack event | DarkSeoul; Operation Blockbuster; Operation Flame; SolarWinds; Clinton Campaign; Stuxnet |
Attack target | NATO; Nuclear Facility; OPCW; Sony; ASEAN; World Health Organization; High-tech Company |
Attacked industry | Government; Espionage; Industry; Military Institutions; Financial Company; Manufacturing; Telecommunication |
Malicious file | mshta.exe; wmiexec.vbs; rundl132.exe; backup.pst; csrss.exe; regsvr32.exe; sqlceip.exe; msfte.dll; pubprn.vbs |
Malware family | Trojan; Agent; Denes; Gh0st; Beacon; MSOffice.Alien.gen; CoreShell; Win32.Mimikatz; Win32.Cobalt |
Region | America; Russia; North Korea; Iran; South Asia; Europe; U.S.; India; Germany |
Operating system | Windows; Linux; Android; Mac OS; Unix; IOS; Kernel Operating System |
Exploited software | Office; Firefox; Word; RDP; Microsoft Exchange; Outlook; Adobe; WinRAR; PDF; Defender; Gmail |

Entity category | Entity knowledge |
APT group | APT28; Fancy Bear; Sofacy; Sednit; Strontium |
Attack equipment | PowerShell; Mimikatz; Koadic; JHUHUGIT; Dropper |
Attack method | Spearphishing; C2; Persistence; Script; DDoS; Backdoor |
Attack vulnerability | CVE-2015-1701; CVE-2017-0263; CVE-2017-0262 |
Attack event | the Hillary Clinton Campaign; VPNFilter |
Attack target | NATO; WADA; OSCE; OPCW; Nuclear Facility |
Attacked industry | Government; Industry; Organization; Education |
Malicious file | rundll32.exe; explorer.exe; twain_64.dll; srhost.exe |
Malware family | ChopStick; Trojan; Win32.Dynamer; Zebrocy |
Region | Russia; U.S.; Europe; India; Germany; U.K.; Israel |
Operating system | Windows; Android |
Exploited software | Office; Microsoft Exchange; Gmail; PDF; NetBIOS; Delphi |

Entity category | Entity knowledge |
APT name | APT32; SeaLotus; OceanLotus; APT-C-00 |
Attack equipment | Cobalt Strike; PowerShell; Mimikatz; RC4; DKMC |
Attack method | Backdoor; C2; Scheduled task; Spearphishing; Script |
Attack vulnerability | CVE-2017-11882; CVE-2016-7255; CVE-2017-8759 |
Attack event | Cobalt Kitty; OceanLotus Blossoms |
Attack target | ASEAN; Asian Nations; the Media; Civil Society |
Attacked industry | Government; Military Institutions; Industry |
Malicious file | pubprn.vbs; rundll32.exe; regsvr32.exe; kb-10233.exe |
Malware family | Denis; Gh0st; Trojan; Beacon; Win32.Agent |
Region | Vietnam; Cambodia; Philippine; China; Laos |
Operating system | Windows; Mac OS |
Exploited software | Office; Outlook; COM; RTF; Dropbox; Amazon S3 |
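
The first row of each of the two group tables above lists several names for the same actor (for example APT28, Fancy Bear, Sofacy, Sednit, and Strontium). One simple way to align such mentions is a normalized alias lookup, sketched below. This is only an illustration under stated assumptions: the page does not describe the authors' alignment method, and the `normalize` and `align` helpers are hypothetical names introduced here.

```python
# Alias lists copied from the first row of each group table above.
ALIASES = {
    "APT28": ["APT28", "Fancy Bear", "Sofacy", "Sednit", "Strontium"],
    "APT32": ["APT32", "SeaLotus", "OceanLotus", "APT-C-00"],
}


def normalize(name):
    """Lowercase the name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


# Map every normalized alias to its canonical group name.
ALIAS_INDEX = {
    normalize(alias): canonical
    for canonical, aliases in ALIASES.items()
    for alias in aliases
}


def align(mention):
    """Return the canonical group name for a mention, or None if it is unknown."""
    return ALIAS_INDEX.get(normalize(mention))


# Hypothetical mentions, including a spacing variant and an unlisted group.
mentions = ["Fancy Bear", "OceanLotus", "ocean lotus", "APT-C-00", "Lazarus"]
aligned = {mention: align(mention) for mention in mentions}

for mention, canonical in aligned.items():
    print(f"{mention!r} -> {canonical}")
```

A dictionary lookup only covers aliases that are already known and surface variants that differ in case, spacing, or punctuation; unseen aliases such as the unlisted group in the example are left unresolved.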