Journal on Communications (通信学报), 2019, Vol. 40, Issue (8): 133-142. doi: 10.11959/j.issn.1000-436x.2019132

• Academic paper

基于多粒度级联孤立森林算法的异常检测模型

杨晓晖,张圣昌   

  1. 河北大学网络空间安全与计算机学院,河北 保定 071002
  • 修回日期:2019-05-03 出版日期:2019-08-25 发布日期:2019-08-30
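  • About the authors: Xiaohui YANG (1975- ), male, born in Julu, Hebei; Ph.D.; professor and master's supervisor at Hebei University. His main research interests are distributed computing, information security, and trusted computing. | Shengchang ZHANG (1993- ), male, born in Handan, Hebei; master's student at Hebei University. His main research interests are distributed computing and information security.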
  • 基金资助:
    国家重点研发计划基金资助项目(2017YFB0802300)

Anomaly detection model based on multi-grained cascade isolation forest algorithm

Xiaohui YANG, Shengchang ZHANG

  1. School of Cyber Security and Computer, Hebei University, Baoding 071002, China
  • Revised: 2019-05-03; Online: 2019-08-25; Published: 2019-08-30
  • Supported by:
    The National Key Research and Development Program of China (2017YFB0802300)

摘要:

孤立森林算法是基于隔离机制的异常检测算法,存在与轴平行的局部异常点无法检测、对高维数据异常点缺乏敏感性和稳定性等问题。针对这些问题,提出了基于随机超平面的隔离机制和多粒度扫描机制,随机超平面使用多个维度的线性组合简化数据模型的隔离边界,利用随机线性分类器的隔离边界能够检测更复杂的数据模式。同时,多粒度扫描机制利用滑动窗口的方式进行维度子采样,每一个维度子集均训练一个森林,多个森林集成投票决策,构造层次化集成学习异常检测模型。实验表明,改进的孤立森林算法对复杂异常数据模式有更好的稳健性,层次化集成学习模型提高了高维数据中异常检测的准确性和稳定性。

关键词: 异常检测, 孤立森林, 隔离机制, 多粒度扫描, 随机超平面

Abstract:

Isolation forest, an anomaly detector built on an isolation mechanism, has two weaknesses: it cannot detect local anomalies that are masked by axis-parallel clusters, and it lacks sensitivity and stability when detecting anomalies in high-dimensional data. To overcome these weaknesses, an isolation mechanism based on random hyperplanes and a multi-grained scanning mechanism are proposed. A random hyperplane, generated as a linear combination of multiple dimensions, simplifies the isolation boundary of the data model. Because this boundary acts as a random linear classifier, it can detect more complex data patterns, and the isolation mechanism becomes more consistent with the distribution characteristics of the data. Multi-grained scanning performs dimensional sub-sampling with a sliding window: one forest is trained on each subset of dimensions, and the forests vote as an ensemble, producing a hierarchical ensemble anomaly detection model. Experiments show that the improved isolation forest is more robust to complex anomaly patterns and that the hierarchical ensemble model improves the accuracy and stability of anomaly detection in high-dimensional data.

Key words: anomaly detection, isolation forest, isolation mechanism, multi-grained scanning, random hyperplane
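
The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and every name and parameter in it (`build_tree`, `forest_scores`, `multi_grained_scores`, `window`, `stride`, `sample_size`) is hypothetical. It makes several assumptions: each split draws a Gaussian random normal vector and a threshold uniformly between the smallest and largest projection; anomaly scores use the standard isolation forest normalisation; and the per-window forests are combined by averaging their scores, which stands in for the ensemble vote described in the abstract.

```python
import numpy as np

EULER_GAMMA = 0.5772156649

def c(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (np.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(X, rng, depth, max_depth):
    """Grow one isolation tree whose splits are random hyperplanes."""
    n = X.shape[0]
    if n <= 1 or depth >= max_depth:
        return {"size": n}
    normal = rng.normal(size=X.shape[1])   # random linear combination of dimensions
    proj = X @ normal
    lo, hi = proj.min(), proj.max()
    if lo == hi:                           # points cannot be separated along this direction
        return {"size": n}
    threshold = rng.uniform(lo, hi)
    left = proj < threshold
    return {
        "normal": normal,
        "threshold": threshold,
        "left": build_tree(X[left], rng, depth + 1, max_depth),
        "right": build_tree(X[~left], rng, depth + 1, max_depth),
    }

def path_length(x, node):
    """Depth at which x reaches a leaf, plus the adjustment for unsplit points."""
    depth = 0
    while "normal" in node:
        side = "left" if x @ node["normal"] < node["threshold"] else "right"
        node = node[side]
        depth += 1
    return depth + c(node["size"])

def forest_scores(X, n_trees=50, sample_size=128, seed=0):
    """Anomaly score in (0, 1] per row of X; higher means more anomalous.

    Assumes X has at least two rows.
    """
    rng = np.random.default_rng(seed)
    psi = min(sample_size, X.shape[0])
    max_depth = int(np.ceil(np.log2(psi)))
    trees = []
    for _ in range(n_trees):
        idx = rng.choice(X.shape[0], size=psi, replace=False)
        trees.append(build_tree(X[idx], rng, 0, max_depth))
    mean_depth = np.array([[path_length(x, t) for t in trees] for x in X]).mean(axis=1)
    return 2.0 ** (-mean_depth / c(psi))

def multi_grained_scores(X, window=3, stride=1, n_trees=50, sample_size=128, seed=0):
    """Slide a window over the dimensions, train one forest per window,
    and average the per-forest scores (an assumed stand-in for voting)."""
    d = X.shape[1]
    per_window = [
        forest_scores(X[:, s:s + window], n_trees, sample_size, seed + s)
        for s in range(0, d - window + 1, stride)
    ]
    return np.mean(per_window, axis=0)
```

Calling `multi_grained_scores(X, window=3)` on an array of shape `(n_samples, n_dims)` returns one score per row. A threshold on these scores, or a top-k cut, would then flag anomalies; how the paper sets that decision rule is not stated in the abstract.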

