Journal on Communications ›› 2020, Vol. 41 ›› Issue (10): 80-91. doi: 10.11959/j.issn.1000-436x.2020174

• Academic Papers •

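Knowledge triple extraction in cybersecurity with adversarial active learning

Tao LI, Yuanbo GUO, Ankang JU

  1. Department of Cryptogram Engineering, Information Engineering University, Zhengzhou 450001, China
  • Revised: 2020-07-23 Online: 2020-10-25 Published: 2020-11-05
  • About the authors: Tao LI (1992- ), male, born in Gangu, Gansu, is a Ph.D. candidate at Information Engineering University; his main research interest is semantic modeling of cyber threats | Yuanbo GUO (1975- ), male, born in Zhouzhi, Shaanxi, Ph.D., is a professor and doctoral supervisor at Information Engineering University; his main research interests are big data security and situational awareness | Ankang JU (1995- ), male, born in Huixian, Henan, is a Ph.D. candidate at Information Engineering University; his main research interests are multi-step attack detection and heterogeneous security data fusion
  • Supported by:
    The National Natural Science Foundation of China (61501515)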

Abstract:

The pipeline methods currently used to acquire cybersecurity knowledge have three problems: errors made in entity recognition propagate to later stages, the correlation between entity recognition and relation extraction is ignored, and labeled corpora for model training are scarce. To address these problems, an end-to-end cybersecurity knowledge triple extraction method with adversarial active learning is proposed. First, entity recognition and relation extraction are modeled as a single sequence labeling task through a joint labeling strategy. Then, a BiLSTM-LSTM model with a dynamic attention mechanism is designed to jointly extract entities and relations and to form triples. Finally, a discriminator is trained within an adversarial learning framework to incrementally select high-quality unlabeled samples for annotation, and the performance of the joint extraction model is improved through iterative retraining. Experiments show that the proposed entity-relation joint extraction model outperforms existing cybersecurity knowledge extraction methods, and they verify the effectiveness of the adversarial active learning scheme.

Key words: knowledge triple, cybersecurity, joint extraction, adversarial network, active learning
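
To illustrate how a joint labeling strategy lets a single tag sequence encode both entities and relations, the following is a minimal Python sketch of the decoding step that turns predicted tags into triples. The abstract does not specify the tag format, so everything here is an assumption made for illustration: the `B/I-<relation>-<role>` tags (role `1` for the head entity, `2` for the tail entity), the nearest-span pairing rule, and the function names `extract_spans` and `decode_triples` are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

Span = Tuple[str, str, str]    # (relation, role, entity text)
Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def extract_spans(tokens: List[str], tags: List[str]) -> List[Span]:
    """Group tokens into entity spans using the ASSUMED joint tag format
    '<B|I>-<relation>-<role>'; the tag 'O' marks tokens outside any entity."""
    spans: List[Span] = []
    key: Optional[Tuple[str, str]] = None  # (relation, role) of the open span
    words: List[str] = []

    def flush() -> None:
        # Close the currently open span, if there is one.
        nonlocal key, words
        if key is not None:
            spans.append((key[0], key[1], " ".join(words)))
        key, words = None, []

    for token, tag in zip(tokens, tags):
        if tag == "O":
            flush()
            continue
        position, rest = tag.split("-", 1)
        relation, role = rest.rsplit("-", 1)  # tolerates hyphens in relation names
        if position == "B" or key != (relation, role):
            flush()
            key = (relation, role)
        words.append(token)
    flush()
    return spans


def decode_triples(tokens: List[str], tags: List[str]) -> List[Triple]:
    """Pair each role-1 span with the nearest unused role-2 span that has
    the same relation (an assumed heuristic), yielding triples."""
    spans = extract_spans(tokens, tags)
    used = set()
    triples: List[Triple] = []
    for i, (relation, role, text) in enumerate(spans):
        if role != "1":
            continue
        best: Optional[int] = None
        for j, (relation2, role2, _) in enumerate(spans):
            if j in used or relation2 != relation or role2 != "2":
                continue
            if best is None or abs(j - i) < abs(best - i):
                best = j
        if best is not None:
            used.add(best)
            triples.append((text, relation, spans[best][2]))
    return triples


# Illustrative sentence and tags (made up for this sketch, not from the paper's corpus).
tokens = ["WannaCry", "ransomware", "exploits", "EternalBlue"]
tags = ["B-exploits-1", "I-exploits-1", "O", "B-exploits-2"]
print(decode_triples(tokens, tags))
```

Because each token carries exactly one tag in this format, the sequence labeling model never passes an intermediate entity list to a separate relation classifier, which is the error propagation path the abstract attributes to pipeline methods. The same constraint also means a token cannot take part in two relations at once under this simplified format.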
