Chinese Journal of Network and Information Security ›› 2023, Vol. 9 ›› Issue (2): 46-55. doi: 10.11959/j.issn.2096-109x.2023020

• Academic Papers •

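Industrial internet intrusion detection method based on an optimized light gradient boosting machine

Xiangdong HU1,2, Lingling TANG1

  1. 1 College of Automation/Institute of Industrial Internet, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
    2 College of Modern Posts, Chongqing University of Posts and Telecommunications, Chongqing 400065, China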
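  • Revised: 2023-02-16 Online: 2023-04-25 Published: 2023-04-01
  • About the authors: Xiangdong HU (1971- ), male, born in Guang'an, Sichuan, is a professor at Chongqing University of Posts and Telecommunications. His main research interests are intelligent sensing, networked measurement and industrial internet security, and intelligent theory and technology for Internet of Things security.
    Lingling TANG (1996- ), female, born in Wulong, Chongqing, is a master's student at Chongqing University of Posts and Telecommunications. Her main research interest is industrial internet security.
  • Supported by:
    The Joint Research Foundation of the Ministry of Education and China Mobile (MCM20180404)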

Method on intrusion detection for industrial internet based on light gradient boosting machine

Xiangdong HU1,2, Lingling TANG1   

  1. 1 College of Automation/Institute of Industrial Internet, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
    2 College of Modern Posts, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Revised: 2023-02-16 Online: 2023-04-25 Published: 2023-04-01
  • Supported by:
    The Joint Research Foundation of the Ministry of Education of the People’s Republic of China and China Mobile (MCM20180404)

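Abstract:

As a proactive security protection technology, intrusion detection plays a vital role in securing the industrial internet. To meet the industrial internet's demand for highly accurate, real-time intrusion detection, an intrusion detection method based on an optimized light gradient boosting machine is proposed. To address the low detection accuracy caused by hard-to-classify samples in industrial internet business data, the original loss function of the light gradient boosting machine is replaced with the focal loss function. This loss function adaptively and dynamically adjusts the loss values and weights of samples from different classes, letting the model reduce the weight of easy-to-classify samples during training and thereby improve detection accuracy on hard-to-classify samples. Because the light gradient boosting machine has many parameters that strongly affect the model's detection accuracy, detection time and degree of fit, the fruit fly optimization algorithm is used to select the optimal parameter combination. The optimal parameter combination is obtained and verified on the gas pipeline dataset provided by Mississippi State University, and the effectiveness of the proposed model is further verified on the water storage tank dataset. Experimental results show that, on the gas pipeline dataset, the model improved by the proposed method raises detection accuracy by at least 3.14% over the comparison models. Its detection time is 0.35 s and 19.53 s shorter than that of the random forest and the support vector machine, respectively, and 0.06 s and 0.02 s longer than that of the decision tree and the extreme gradient boosting machine, respectively. The model also achieves good detection results on the water storage tank dataset. The proposed method can therefore identify attack samples in industrial internet business data well, which improves its practicality for industrial internet intrusion detection.

Keywords: industrial internet, intrusion detection, light gradient boosting machine, focal loss function, fruit fly optimization algorithm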
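
The down-weighting of easy-to-classify samples described in the abstract can be illustrated with a minimal sketch of the focal loss. This is not the authors' implementation. It assumes the binary form of the loss, whereas the paper's datasets may be multi-class, and the values `gamma=2.0` and `alpha=0.25` are common defaults from the focal loss literature, not values reported here. The function names are hypothetical.

```python
import numpy as np

def focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-12):
    """Per-sample binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are assumed values, not taken from the paper.
    """
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    # p_t is the predicted probability of the true class.
    p_t = np.where(y_true == 1, p_pred, 1.0 - p_pred)
    alpha_t = np.where(y_true == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

def cross_entropy(y_true, p_pred, eps=1e-12):
    """Per-sample binary cross-entropy, used here only as a baseline."""
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    p_t = np.where(y_true == 1, p_pred, 1.0 - p_pred)
    return -np.log(p_t)

# One easy sample (p_t = 0.95) and one hard sample (p_t = 0.30), both of class 1.
y = np.array([1, 1])
p = np.array([0.95, 0.30])
fl = focal_loss(y, p)
ce = cross_entropy(y, p)

# The ratio is the modulating weight alpha_t * (1 - p_t)**gamma:
# it is far smaller for the easy sample than for the hard one.
ratio = fl / ce
print(ratio)
```

To use such a loss as a custom objective in a gradient boosting library, one would additionally need its gradient and Hessian with respect to the raw model score, which this sketch omits.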

Abstract:

Intrusion detection is a proactive security protection technology that plays a vital role in ensuring the security of the industrial internet. To meet the requirements of high accuracy and real-time performance for intrusion detection in the industrial internet, an intrusion detection method based on an optimized light gradient boosting machine was proposed. To address the low detection accuracy caused by difficult-to-classify samples in industrial internet business data, the original loss function of the light gradient boosting machine was replaced with a focal loss function. This function dynamically adjusts the loss value and weight of samples from different classes during training, reducing the weight of easy-to-classify samples to improve detection accuracy for difficult-to-classify samples. Because the light gradient boosting machine has many parameters that strongly influence the detection accuracy, detection time and fitting degree of the model, a fruit fly optimization algorithm was used to select the optimal parameter combination. The optimal parameter combination was obtained and verified on the gas pipeline dataset provided by Mississippi State University, and the effectiveness of the proposed model was further verified on the water storage tank dataset. The experimental results show that the detection accuracy of the proposed method on the gas pipeline dataset is at least 3.14% higher than that of the comparison models. Its detection time is 0.35 s and 19.53 s lower than that of the random forest and support vector machine, respectively, and 0.06 s and 0.02 s higher than that of the decision tree and extreme gradient boosting machine, respectively. The proposed method also achieved good detection results on the water storage tank dataset. Therefore, the proposed method can effectively identify attack data samples in industrial internet business data and improves the practicality and efficiency of intrusion detection in the industrial internet.

Keywords: industrial internet, intrusion detection, light gradient boosting machine, focal loss, fruit fly optimization algorithm
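
The parameter selection step mentioned in the abstract can be sketched as a fruit-fly-style search loop. This is a simplified variant under stated assumptions, not the authors' procedure: it perturbs the swarm location directly in parameter space and omits the distance-based smell-concentration judgement of the original fruit fly optimization algorithm. The objective is a toy stand-in for a validation score, and the choice of `learning_rate` and `num_leaves` as tuned parameters, their bounds, and all function names are illustrative assumptions.

```python
import numpy as np

def fruit_fly_search(objective, lower, upper, n_flies=20, n_iter=50, step=0.1, seed=0):
    """Simplified fruit-fly-style maximization over a box-bounded parameter space."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    span = upper - lower
    # Random initial swarm location inside the search box.
    swarm = lower + rng.random(lower.size) * span
    best_score = objective(swarm)
    for _ in range(n_iter):
        # Smell phase: each fly takes a random step around the swarm location.
        offsets = rng.uniform(-step, step, size=(n_flies, lower.size)) * span
        flies = np.clip(swarm + offsets, lower, upper)
        scores = np.array([objective(f) for f in flies])
        # Vision phase: the swarm moves to the best fly if it beats the best so far.
        i = int(np.argmax(scores))
        if scores[i] > best_score:
            best_score = scores[i]
            swarm = flies[i]
    return swarm, best_score

def toy_objective(params):
    """Hypothetical stand-in for validation accuracy; peaks at (0.1, 31)."""
    learning_rate, num_leaves = params
    return -((learning_rate - 0.1) / 0.3) ** 2 - ((num_leaves - 31) / 100) ** 2

best_params, best_score = fruit_fly_search(
    toy_objective, lower=[0.01, 8], upper=[0.3, 128]
)
print(best_params, best_score)
```

In a real tuning run the objective would train and validate a model for each candidate, and integer-valued parameters such as `num_leaves` would need to be rounded before use.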

