Journal on Communications ›› 2021, Vol. 42 ›› Issue (10): 162-172. doi: 10.11959/j.issn.1000-436x.2021181

• Papers

Malicious domain name detection method based on associated information extraction

Bin ZHANG1,2, Renjie LIAO1,2   

  1. Department of Cryptogram Engineering, Information Engineering University, Zhengzhou 450001, China
  2. He’nan Province Key Laboratory of Information Security, Zhengzhou 450001, China
  • Revised: 2021-07-01 Online: 2021-10-25 Published: 2021-10-01
  • Supported by:
    The Open Fund Project of Information Assurance Technology Key Laboratory (KJ-15-109); The New Research Direction Cultivation Fund of Information Engineering University (2016604703); The Research Project of Information Engineering University (2019f3303)

Abstract:

To improve the accuracy of malicious domain name detection based on associated information, a detection method combining resolution information and query time was proposed. First, the resolution information was mapped to nodes and edges in a heterogeneous information network, which improved the utilization rate of that information. Second, because extracting associated information by matrix multiplication has high computational complexity, an efficient breadth-first network traversal algorithm based on meta-paths was proposed. Then, the query time was used to detect domain names lacking meta-path information, which improved the coverage rate. Finally, domain names were vectorized by representation learning with adaptive weights. The Euclidean distance between domain name feature vectors was used to quantify the correlation between domain names. Based on the learned vectors, a supervised classifier was constructed to detect malicious domain names. Theoretical analysis and experimental results show that the proposed method performs well in extracting the associated information of domain names. The coverage rate and F1 score are 97.7% and 0.951, respectively.
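
The meta-path traversal mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes a toy heterogeneous network with hypothetical node types (`domain`, `ip`, `client`) and made-up node names, and it expands a frontier one meta-path step at a time instead of multiplying adjacency matrices. The path counts it returns correspond to one row of the matrix product for the same meta-path, computed only for the starting domain name.

```python
from collections import defaultdict

# Hypothetical toy network. Each edge links two typed nodes:
# (type_1, name_1, type_2, name_2). All names are illustrative only.
EDGES = [
    ("domain", "a.example", "ip", "192.0.2.1"),
    ("domain", "b.example", "ip", "192.0.2.1"),
    ("domain", "c.example", "ip", "192.0.2.2"),
    ("client", "host-1", "domain", "a.example"),
    ("client", "host-1", "domain", "c.example"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map keyed by (type, name) nodes."""
    adj = defaultdict(set)
    for t1, n1, t2, n2 in edges:
        adj[(t1, n1)].add((t2, n2))
        adj[(t2, n2)].add((t1, n1))
    return adj


def meta_path_neighbors(adj, start, meta_path):
    """Count meta-path instances from `start` to each end node.

    `meta_path` is a sequence of node types, for example
    ("domain", "ip", "domain"). The frontier is expanded level by level
    (breadth-first), keeping only neighbors of the next required type.
    """
    if start[0] != meta_path[0]:
        raise ValueError("start node type must match the meta-path head")
    frontier = {start: 1}
    for next_type in meta_path[1:]:
        next_frontier = defaultdict(int)
        for node, count in frontier.items():
            for neighbor in adj.get(node, ()):
                if neighbor[0] == next_type:
                    next_frontier[neighbor] += count
        frontier = next_frontier
    result = dict(frontier)
    result.pop(start, None)  # a domain is trivially associated with itself
    return result


ADJ = build_adjacency(EDGES)
shared_ip = meta_path_neighbors(
    ADJ, ("domain", "a.example"), ("domain", "ip", "domain")
)
shared_client = meta_path_neighbors(
    ADJ, ("domain", "a.example"), ("domain", "client", "domain")
)
```

In this toy data, `shared_ip` associates `a.example` with `b.example` through a common IP address, and `shared_client` associates it with `c.example` through a common querying host. A domain name with no such neighbors is the kind of case the abstract handles separately using query time.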

Key words: malicious domain name detection, heterogeneous information network, domain name resolution information, query time, representation learning

