Chinese Journal of Network and Information Security ›› 2020, Vol. 6 ›› Issue (5): 1-10. doi: 10.11959/j.issn.2096-109x.2020062
• Review •
Revised:
2020-02-16
Online:
2020-10-15
Published:
2020-10-19
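About the authors:
Xi FU (1996- ), female, born in Xi'an, Shaanxi, is a master's student at Xidian University; her main research interest is information security. | Hui LI (1968- ), male, born in Lingbao, Henan, Ph.D., is a professor and doctoral supervisor at Xidian University; his main research interests are cryptography, wireless network security, cloud computing security, and information theory and coding theory. | Xingwen ZHAO (1977- ), male, born in Yulin, Guangxi, Ph.D., is an associate professor at Xidian University; his main research interests are applications of artificial intelligence in network security, secure multi-party data sharing, and privacy-preserving cryptographic protocols such as anonymous authentication.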
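Abstract:
With the continued development of the Internet, phishing poses a growing threat to people's daily lives. Phishing detection is a core security technology for countering phishing attacks, and it helps users avoid the security threats those attacks cause. This survey first introduces the basic concepts of phishing and analyzes the state of research on phishing detection techniques in detail. It then summarizes the current application scenarios of phishing detection. Finally, it discusses possible directions for future research.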
Xi FU, Hui LI, Xingwen ZHAO. Survey on phishing detection research[J]. Chinese Journal of Network and Information Security, 2020, 6(5): 1-10.
Table 1 Comparison of typical detection methods
Typical work | List-based | Heuristic | Machine learning | Accuracy | FNR | FPR | Time |
Google API | √ | | | — | — | — | 0.08 s |
PhishNet | √ | √ | | — | 3% | 5% | 0.001 s |
CANTINA | | √ | | 95% | — | 6% | — |
Phishark | | √ | | — | — | 2.4% | 2 s |
Website logo method | | √ | | 93.4% | 0.2% | 13% | — |
SHLR | | √ | √ | 98.9% | — | — | 0.029 ms |
URLs based method | | | √ | 98.3% | — | 2.6% | — |
CGRU | | | √ | 99.6% | — | — | 640.78 s |
MFPD | | | √ | 98.99% | — | 0.59% | — |
PEDS | | | √ | 98.63% | 0.93% | 1.81% | — |
THEMIS | | | √ | 99.85% | 0.128% | 0.043% | — |
PhishAri | | | √ | 92.52% | 7.78% | 9.6% | 0.425 s |
Twitter phishing | | | √ | 94.75% | 5.12% | 5.3% | — |
HEFS | | | √ | 94.6% | — | — | 0.052 ms |
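The accuracy, FNR, and FPR columns of Table 1 are confusion-matrix ratios. The following is a minimal sketch of how they are usually computed, assuming the standard definitions with phishing as the positive class; individual papers in the table may count slightly differently. The class name `ConfusionCounts`, the function names, and the sample counts are illustrative assumptions and do not come from any of the cited works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary detector; 'positive' means phishing."""
    tp: int  # phishing samples flagged as phishing
    fn: int  # phishing samples that were missed
    fp: int  # legitimate samples flagged as phishing
    tn: int  # legitimate samples correctly passed


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all samples classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


def false_negative_rate(c: ConfusionCounts) -> float:
    """Fraction of phishing samples that the detector missed."""
    return c.fn / (c.fn + c.tp)


def false_positive_rate(c: ConfusionCounts) -> float:
    """Fraction of legitimate samples wrongly flagged as phishing."""
    return c.fp / (c.fp + c.tn)


# Hypothetical evaluation: 100 phishing and 100 legitimate samples.
counts = ConfusionCounts(tp=90, fn=10, fp=5, tn=95)
print(f"accuracy={accuracy(counts):.3f}")
print(f"FNR={false_negative_rate(counts):.3f}")
print(f"FPR={false_positive_rate(counts):.3f}")
```

Because accuracy pools both classes, a detector evaluated on a heavily imbalanced dataset can report high accuracy while still missing many phishing samples, which is one reason the table lists FNR and FPR separately.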
Table 2 Comparison of machine learning-based solutions
Typical method | Approach | Algorithm | Dataset size | Features | Accuracy |
URLs based method | ML | DF | 6 197 | 12 URL features | 97.7% |
 | | KNN | | | 90.3% |
 | | RF | | | 96.0% |
 | | LR | | | 95.4% |
CGRU | DL | Convolutional neural network + gated recurrent unit | 407 212 | Semantic features derived from raw URLs | 99.6% |
A stacking model | DL | GBDT+XGBoost+LightGBM | 49 947 | 8 URL features + 12 HTML features | 97.30% |
 | | | 51 103 | | 98.60% |
MFPD | DL | LSTM+XGBoost | 2 010 779 | 24 URL features + text features | 99.41% |
PEDS | ML | Reinforcement learning | 12 326 | 50 features extracted from emails | 98.6% |
THEMIS | DL | RCNN | 7 781 | Semantic features derived from email text content | 99.84% |
IPDPS | ML | ANFIS | 11 056 | 22 text features + 5 image features + 8 frame features | 98.3% |
HEFS | ML | RF | 10 000 | 48 features extracted from URLs and HTML source code | 96.17% |
 | | C4.5 | | | 94.37% |
 | | SVM | | | 92.20% |
 | | NB | | | 84.10% |
[1] | BLEAU H . 2017 Global fraud and cybercrime forecast[EB]. |
[2] | Phishing activity trends report 3rd quarter 2019[R]. |
[3] | WANG W . Review of anti-phishing technology[J]. Journal of Shandong Radio and TV University, 2011(3): 45-46,49. (in Chinese) |
[4] | LI J F , WANG W . Research on recognition and analysis methods of phishing websites[J]. Communications Management and Technology, 2018(3): 62-64. (in Chinese) |
[5] | SHA H Z , LIU Q Y , LIU T W ,et al. Review of malicious web recognition[J]. Chinese Journal of Computers, 2016(3): 529-542. (in Chinese) |
[6] | Google safe browsing APIs[EB]. |
[7] | HAN W , CAO Y , BERTINO E ,et al. Using automated individual white-list to protect web digital identities[J]. Expert Systems with Applications, 2012,39(15). |
[8] | PRAKASH P , KUMAR M , KOMPELLA R R ,et al. PhishNet:predictive blacklisting to detect phishing attacks[C]// 29th INFOCOM. 2010 |
[9] | ZHANG J , PORRAS P A , ULLRICH J . Highly predictive blacklisting[C]// 17th USENIX Security Symposium 2008. |
[10] | SHENG S , WARDMAN B , WARNER G ,et al. An empirical analysis of phishing blacklists[C]// The 6th Conf.Email Anti-Spam (CEAS) |
[11] | GASTELLIER-PREVOST S , GRANADILLO G G , LAURENT M . Decisive heuristics to differentiate legitimate from phishing sites[C]// Network & Information Systems Security. 2011. |
[12] | KANG L C , CHANG E H , SZE S N ,et al. Utilisation of website logo for phishing detection[J]. Computers & Security, 2015(54): 16-26. |
[13] | DUNLOP M , GROAT S , SHELLY D . GoldPhish:using images for content-based phishing analysis[C]// Fifth International Conference on Internet Monitoring & Protection. 2010. |
[14] | HUH J H , KIM H . Phishing detection with popular search engines:simple and effective[M]// Foundations and Practice of Security. Berlin Heidelberg:Springer, 2011. |
[15] | JAIN A K , GUPTA B B . Two-level authentication approach to protect from phishing attacks in real time[J]. Ambient Intelligence and Humanized Computing, 2018,9(6): 1783-1796. |
[16] | TAN C L , KANG L C , WONG K S ,et al. PhishWHO:phishing webpage detection via identity keywords extraction and target domain name finder[J]. Decision Support Systems, 2016(88): 18-27. |
[17] | VARSHNEY G , MISRA M , ATREY P K . Improving the accuracy of search engine based anti-phishing solutions using lightweight features[C]// 11th ICITST 2016. |
[18] | ZHANG Y , HONG J I , CRANOR L F . Cantina:a content-based approach to detecting phishing web sites[C]// 16th WWW, 2007. |
[19] | RAO R S , PAIS A R . Jail-phish:an improved search engine based phishing detection system[J]. Computers & Security, 2019(83): 246-267. |
[20] | LE A , MARKOPOULOU A , FALOUTSOS M . PhishDef:URL names say it all[J]. CoRR abs/1009.2275, 2010. |
[21] | MA J , SAUL L K , SAVAGE S ,et al. Beyond blacklists:learning to detect malicious Web sites from suspicious URLs[C]// KDD. 2009. |
[22] | YUAN H , CHEN X , LI Y ,et al. Detecting phishing websites and targets based on URLs and webpage links[C]// 24th ICPR 2018. |
[23] | GARERA S , PROVOS N , CHEW M ,et al. A framework for detection and measurement of phishing attacks[C]// ACM Workshop on Recurring Malcode. 2007. |
[24] | MCGRATH D K , GUPTA M . Behind phishing:an examination of phisher modi operandi[C]// 5th NSDI 2008. |
[25] | LI Y , YANG Z , XU C ,et al. A stacking model using URL and HTML features for phishing webpage detection[J]. Future Generation Comp Syst, 2019(94): 27-39. |
[26] | SAHINGOZ O K , BUBER E , DEMIR O ,et al. Machine learning based phishing detection from URLs[J]. Expert Syst Appl, 2019(117): 345-357. |
[27] | YANG W , ZUO W , CUI B . Detecting malicious URLs via a keyword-based convolutional gated-recurrent-unit neural network[J]. IEEE Access, 2019(7): 29891-29900. |
[28] | YANG P , ZHAO G , ZENG P . Phishing website detection based on multidimensional features driven by deep learning[J]. IEEE Access, 2019(7): 15196-15209. |
[29] | MARCHAL S , ARMANO G , GRONDAHL T ,et al. Off-the-hook:an efficient and usable client-side phishing prevention application[J]. IEEE Trans.Computers, 2017,66(10): 1717-1733. |
[30] | RAO R S , PAIS A R . Detection of phishing websites using an efficient feature-based machine learning framework[J]. Neural Computing and Applications, 2019,31(8): 3851-3873. |
[31] | SHIRAZI H , BEZAWADA B , RAY I . Unbiased phishing detection using domain name based features[C]// SACMAT. 2018. |
[32] | MARCHAL S , SAARI K , SINGH N ,et al. Know your phish:novel techniques for detecting phishing sites and their targets[C]// 36th ICDCS. 2016. |
[33] | RAMESH G , KRISHNAMURTHI I , KUMAR K S S . An efficacious method for detecting phishing webpages through target domain identification[J]. Decision Support Systems, 2014,61: 12-22. |
[34] | XIANG G , HONG J I , ROSÉ C P ,et al. CANTINA+:a feature-rich machine learning framework for detecting phishing web sites[J]. ACM Trans Inf Syst Secur, 2011,14(2): 21:1-21:28. |
[35] | FATT J C S , LENG C K , NAH S S . Phishdentity:leverage website favicon to offset polymorphic phishing website[C]// ARES. 2014. |
[36] | CHIEW K L , CHOO J S F , SZE S N ,et al. Leverage website favicon to detect phishing websites[J]. Security and Communication Networks, 2018:11. |
[37] | ZHOU C C , ZHANG D Y . Using image recognition technology to filter mass suspicious phishing sites[J]. Computer Technology and Development, 2012(11): 246-249. (in Chinese) |
[38] | XIAO H Y . The application of image recognition to phishing detection[J]. Journal of Cangzhou Normal University, 2012(3): 75-79. (in Chinese) |
[39] | HARA , YAMADA , MIYAKE . Visual similarity-based phishing detection without victim site information[C]// ICICS. 2009. |
[40] | FANG Y , HUANG C , LIU L ,et al. Research on malicious JavaScript detection technology based on LSTM[J]. IEEE Access, 2018,6: 59118-59125. |
[41] | DONG Z , KAPADIA A , BLYTHE J ,et al. Beyond the lock icon:real-time detection of phishing websites using public key certificates[C]// eCrime. 2015. |
[42] | MENSAH P , BLANC G , OKADA K ,et al. AJNA:anti-phishing JS-based visual analysis,to mitigate users' excessive trust in SSL/TLS[C]// 4th BADGERS@RAID. 2015. |
[43] | DRURY V , MEYER U . Certified phishing:taking a look at public key certificates of phishing websites[C]// 15th SOUPS @ USENIX Security Symposium. 2019. |
[44] | HU X D , LIU K , ZHANG F ,et al. Financial phishing detection method based on sensitive characteristics of webpage[J]. Chinese Journal of Network and Information Security, 2017(2): 35-42. (in Chinese) |
[45] | FANG Y , LONG X , HUANG C ,et al. Research on classifying phishing URLs using hybrid architecture of LSTM and random forest[J]. Advanced Engineering Sciences, 2018(5). (in Chinese) |
[46] | CHIEW K L , TAN C L , WONG K ,et al. A new hybrid ensemble feature selection framework for machine learning-based phishing detection system[J]. Inf Sci, 2019,484: 153-166. |
[47] | BABAGOLI M , AGHABABA M P , SOLOUK V . Heuristic nonlinear regression strategy for detecting phishing websites[J]. Soft Comput, 2019,23(12): 4315-4327. |
[48] | Phishing activity trends report 4th quarter 2016[EB]. |
[49] | Phishing activity trends report 1st-3rd quarter 2015[EB]. |
[50] | Microsoft security intelligence report[R]. 2018. |
[51] | WANG X L . Guidelines for preventing phishing email attacks[J]. Computer & Network, 2018,581(13): 56-57. (in Chinese) |
[52] | GASCON H , ULLRICH S , STRITTER B ,et al. Reading between the lines:content-agnostic detection of spear-phishing emails[C]// 21st RAID. 2018. |
[53] | SMADI S , ASLAM N , ZHANG L . Detection of online phishing email using dynamic evolving neural network based on reinforcement learning[J]. Decision Support Systems, 2018,107: 88-102. |
[54] | FANG Y , ZHANG C , HUANG C ,et al. Phishing email detection using improved RCNN model with multilevel vectors and attention mechanism[J]. IEEE Access, 2019,7: 56329-56340. |
[55] | GAO Y N , CAI M C . Research of phishing-mail detection based on Hadoop and Mahout[J]. Computer Knowledge and Technology, 2016,12(11): 27-30. (in Chinese) |
[56] | AGGARWAL A , RAJADESINGAN A , KUMARAGURU P . PhishAri:automatic realtime phishing detection on Twitter[J]. CoRR abs/1301.6899, 2013 |
[57] | SHARMA N , SHARMA N , TIWARI V ,et al. Real-time detection of phishing Tweets[C]// The Fourth International Conference on Computer Science,Engineering and Applications. 2014. |
[58] | LIEW S W , SANI N F M , ABDULLAH M T ,et al. An effective security alert mechanism for real-time phishing tweet detection on twitter[J]. Computers & Security, 2019 |
[59] | SHI C H . How to identify and prevent phishing on mobile terminals[J]. Computer Knowledge and Technology, 2017(31): 43-44. (in Chinese) |
[60] | BICAKCI K , UNAL D , ASCIOGLU N ,et al. Mobile authentication secure against man-in-the-middle attacks[C]// 2nd Mobile Cloud 2014. |
[61] | GOEL D , JAIN A K . Mobile phishing attacks and defence mechanisms:state of art and open research challenges[J]. Computers & Security, 2018,73: 519-544. |
[62] | WU L , DU X , WU J . Effective defense schemes for phishing attacks on mobile computing platforms[J]. IEEE Trans Vehicular Technology, 2016,65(8): 6678-6691. |
[63] | VIRVILIS N , TSALIS N , MYLONAS A ,et al. Mobile devices:a phisher′s paradise[C]// 11th International Conference on Security and Cryptography. 2014. |
[64] | DUAN Q . A solution for phishing attack on Android platform[J]. Cyberspace Security, 2016,7(4): 50-54. (in Chinese) |
[65] | NDIBWILE J D , KADOBAYASHI Y , FALL D . UnPhishMe:phishing attack detection by deceptive login simulation through an android mobile App[C]// 12th AsiaJCIS. 2017. |
[66] | LIU D , LIU D , LI Y ,et al. Efficient Android phishing detection based on improved naïve Bayes algorithm[C]// 10th ICSI. 2019. |
[67] | LIU Y M , YANG J . Detection of Android phishing malware based on image similarity[J]. Computer Systems & Applications, 2014,23(12): 170-175. (in Chinese) |
[68] | DING Y , LUKTARHAN N , LI K ,et al. A keyword-based combination approach for detecting phishing webpages[J]. Computers & Security, 2019,84: 256-275. |
[69] | ADEBOWALE M A , LWIN K T , SÁNCHEZ E ,et al. Intelligent web-phishing detection and protection scheme using integrated features of images,frames and text[J]. Expert Syst Appl, 2019,115: 300-313. |
[70] | ABUTAIR H , BELGHITH A , ALAHMADI S . CBR-PDS:a case-based reasoning phishing detection system[J]. Journal of Ambient Intelligence and Humanized Computing, 2019,10(7): 2593-2606. |
[71] | HE M , HORNG S J , FAN P ,et al. An efficient phishing webpage detector[J]. Expert Syst Appl, 2011,38(10): 12018-12027. |
[72] | MOHAMMAD R M , THABTAH F , MCCLUSKEY L . An assessment of features related to phishing websites using an automated technique[C]// 7th ICITST. 2012. |
[73] | FALCONIERI V . Open dataset of phishing and tor hidden services screen-captures[J]. CoRR abs/1908.02449, 2019 |
[74] | ZHANG X , YAN Z W , LI H T ,et al. Research of phishing detection technology[J]. Chinese Journal of Network and Information Security, 2017,3(7): 11-28. (in Chinese) |
[75] | SHIRAZI H , BEZAWADA B , RAY I ,et al. Adversarial sampling attacks against phishing detection[M]// Data and Applications Security and Privacy.Berlin:Springer. 2019. |