Journal on Communications ›› 2021, Vol. 42 ›› Issue (6): 84-93. doi: 10.11959/j.issn.1000-436x.2021117

• Academic paper

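Reduction algorithm based on supervised discriminant projection for network security data

Fangfang GUO, Hongwu LYU, Weilin REN, Ruini WANG

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  • Revised: 2021-03-01 Online: 2021-06-25 Published: 2021-06-01
  • About the authors: Fangfang GUO (1973−), male, born in Harbin, Heilongjiang, Ph.D., associate professor and master's supervisor at Harbin Engineering University. His research interests include computer network applications, new network architectures, network security situation awareness, and cloud monitoring.
    Hongwu LYU (1983−), male, born in Rizhao, Shandong, Ph.D., associate professor and doctoral supervisor at Harbin Engineering University. His research interests include network security, mobile cloud computing and mobile edge computing, and formal modeling and performance evaluation.
    Weilin REN (1997−), male, born in Harbin, Heilongjiang, master's student at Harbin Engineering University. His research interests include network data preprocessing and network situation awareness prediction.
    Ruini WANG (1994−), female, born in Yuncheng, Shanxi, master's student at Harbin Engineering University. Her research interests include manifold learning and anomaly handling for network data.
  • Supported by:
    The National Natural Science Foundation of China (61872104); The Fundamental Research Fund for the Central Universities (3072020CF0603)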

Abstract:

Traditional manifold learning algorithms ignore the class information of the raw data during dimensionality reduction, and the reduced data generally cluster poorly. To address these shortcomings, a manifold learning dimensionality reduction algorithm based on supervised discriminant projection (SDP) was proposed to improve dimensionality reduction for network security data. Starting from the nearest neighbor matrix, the class label information of the data set was used to construct a supervised discriminant matrix, turning unsupervised manifold learning into supervised learning. The goal was to find a low-dimensional projection subspace with both the maximum global scatter matrix and the minimum local scatter matrix, so that after projection samples of the same class are concentrated and samples of different classes are dispersed. The experimental results show that, compared with traditional dimensionality reduction algorithms, the SDP algorithm removes redundant data with relatively low time complexity. The reduced data also cluster better and samples of different classes are more dispersed, which makes the algorithm suitable for practical network security data analysis models.

Key words: data dimension reduction, manifold learning, supervised learning, discriminant projection
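
The abstract describes the idea only in words, so the following is a minimal sketch of one way such a projection could be set up, not the authors' formulation. It assumes that the local scatter comes from the Laplacian of a k-nearest-neighbor graph whose edges are kept only between samples of the same class, that the global scatter is the total covariance, and that the projection is given by the leading generalized eigenvectors. The names `sdp_fit`, `k`, and `reg` are hypothetical and do not come from the paper.

```python
import numpy as np


def sdp_fit(X, y, n_components=2, k=5, reg=1e-6):
    """Return a (d, n_components) projection for samples X (n, d) with labels y (n,)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    Xc = X - X.mean(axis=0)

    # k-nearest-neighbor adjacency from pairwise squared Euclidean distances.
    sq = np.sum(Xc ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * Xc @ Xc.T
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    nn = np.argsort(dist, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nn.ravel()] = 1.0
    A = np.maximum(A, A.T)  # make the graph symmetric

    # Supervised step (assumption): keep an edge only if both ends share a label.
    A *= (y[:, None] == y[None, :])

    # Local scatter from the graph Laplacian, global scatter from all samples.
    L = np.diag(A.sum(axis=1)) - A
    S_local = Xc.T @ L @ Xc / n + reg * np.eye(d)  # reg keeps it positive definite
    S_global = Xc.T @ Xc / n

    # Solve S_global w = lambda * S_local w by whitening S_local = C C^T.
    C = np.linalg.cholesky(S_local)
    C_inv = np.linalg.inv(C)
    vals, vecs = np.linalg.eigh(C_inv @ S_global @ C_inv.T)
    top = np.argsort(vals)[::-1][:n_components]
    return C_inv.T @ vecs[:, top]


# Synthetic two-class data standing in for labeled network records.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (40, 5)), rng.normal(4.0, 1.0, (40, 5))])
y = np.array([0] * 40 + [1] * 40)
W = sdp_fit(X, y, n_components=2, k=5)
Z = (X - X.mean(axis=0)) @ W  # low-dimensional embedding
```

Maximizing the ratio of global to local scatter favors directions along which the data spread widely overall while same-class neighbors stay close together, which is the behavior the abstract attributes to SDP. The small `reg` term is an implementation convenience for the Cholesky step and is not part of the described method.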
