Journal on Communications, 2016, Vol. 37, Issue (Z1): 116-124. doi: 10.11959/j.issn.1000-436x.2016257

Phishing attacks discovery based on HTML layout similarity

Xue-qiang ZOU1,2, Peng ZHANG1, Cai-yun HUANG, Zhi-peng CHEN1, Yong SUN1, Qing-yun LIU1

  1 Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
  2 National Computer Network Emergency Response and Coordination Center, Beijing 100029, China
  • Online: 2016-10-25 Published: 2017-01-17
  • Supported by:
    The National Natural Science Foundation of China; The National High Technology Research and Development Program of China (863 Program)

Abstract:

An approach to discovering phishing sites is presented, based on the similarity in layout structure between phishing sites and the genuine sites they imitate. First, tags with link attributes are extracted as features. Based on these features, the page tag sequence branches that identify a website are extracted. The page layout similarity method, HTMLTagAntiPhish, then converts the alignment of page tag sequence trees into the alignment of page tag sequence branches, which reduces a two-dimensional tree structure to a one-dimensional string structure. Finally, the sequences are encoded so that the BLOSUM62 substitution matrix from bioinformatics can be used to compute alignment scores quickly, which improves the efficiency of phishing site detection. A series of simulation experiments shows that the approach is feasible and achieves high precision and recall.

Key words: layout similarity, phishing attack, tag sequence tree
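
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the details of HTMLTagAntiPhish, so everything here is an assumption made for illustration: the names `LINK_ATTRS`, `extract_branches`, `encode`, `align_score` and `layout_similarity` are hypothetical, the choice of link attributes is a guess, the alignment is a plain global alignment, and a simple match/mismatch score stands in for the BLOSUM62 matrix.

```python
from html.parser import HTMLParser

# Assumed set of "link attributes"; the paper may use a different set.
LINK_ATTRS = {"href", "src", "action"}
# Void elements have no closing tag, so they are never pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
# The 20 amino-acid letters, so each tag is encoded as one residue symbol.
ALPHABET = "ARNDCQEGHILKMFPSTWYV"
MATCH, MISMATCH, GAP = 2, -1, -2  # stand-in scores, not BLOSUM62 values


class BranchExtractor(HTMLParser):
    """Collects the root-to-node tag path of every tag with a link attribute."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.branches = []

    def handle_starttag(self, tag, attrs):
        if any(name in LINK_ATTRS for name, _ in attrs):
            self.branches.append(tuple(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack and self.stack.pop() != tag:
                pass


def extract_branches(html):
    parser = BranchExtractor()
    parser.feed(html)
    parser.close()
    return parser.branches


def encode(branch, codebook):
    """Turn a tag path into a string, one letter per tag.

    Letters wrap around after 20 distinct tags, so collisions are possible.
    """
    return "".join(
        codebook.setdefault(tag, ALPHABET[len(codebook) % len(ALPHABET)])
        for tag in branch
    )


def align_score(s, t):
    """Global alignment score of two strings, computed row by row."""
    prev = [j * GAP for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * GAP]
        for j in range(1, len(t) + 1):
            sub = MATCH if s[i - 1] == t[j - 1] else MISMATCH
            cur.append(max(prev[j - 1] + sub, prev[j] + GAP, cur[j - 1] + GAP))
        prev = cur
    return prev[-1]


def layout_similarity(html_a, html_b):
    """Average, over branches of page A, of the best normalized score in page B."""
    codebook = {}
    a = [encode(b, codebook) for b in extract_branches(html_a)]
    b = [encode(b, codebook) for b in extract_branches(html_b)]
    if not a or not b:
        return 0.0
    total = 0.0
    for s in a:
        best = max(align_score(s, t) / (MATCH * max(len(s), len(t))) for t in b)
        total += max(best, 0.0)  # negative scores count as no similarity
    return total / len(a)
```

In this sketch a suspect page would be compared against each protected page with `layout_similarity`, and flagged when the score exceeds a threshold chosen on labeled data. Replacing the match/mismatch scores with a real BLOSUM62 lookup only requires changing how `sub` is computed in `align_score`.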
