Chinese Journal of Network and Information Security ›› 2024, Vol. 10 ›› Issue (4): 98-108. doi: 10.11959/j.issn.2096-109x.2024056

• Academic paper •

Encrypted traffic identification scheme based on sliding window and randomness features

Jiachi LIU1, Boyu KUANG1, Mang SU2, Yaqian XU3, Anmin FU1,2

  1. School of Cyber Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  2. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  3. China Center for Information Industry Development, Beijing 100000, China
  • Received: 2024-03-22 Revised: 2024-07-09 Online: 2024-08-25 Published: 2024-09-14
  • Contact: Yaqian XU E-mail: xuyaqian@ccidthinktank.com
  • About the authors: Jiachi LIU (2000-), male, from Anyang, Henan, is a master's student at Nanjing University of Science and Technology. His main research interest is encrypted traffic analysis.
    Boyu KUANG (1994-), male, from Mianyang, Sichuan, holds a Ph.D. His main research interests are Internet of Things security and Internet of Vehicles security.
    Mang SU (1987-), female, from Ongniud Banner, Inner Mongolia, is an associate professor at Nanjing University of Science and Technology. Her main research interests are cloud computing security and privacy protection.
    Yaqian XU (1985-), female, from Rizhao, Shandong, holds a Ph.D. and is an associate researcher at the China Center for Information Industry Development. Her main research interests include ICT supply chain security and industrial Internet security.
    Anmin FU (1981-), male, from Tongcheng, Hubei, is a professor and doctoral supervisor at Nanjing University of Science and Technology. His main research interests are network and system security, data security, and privacy protection.
  • Supported by:
    The National Natural Science Foundation of China (62072239); Qing Lan Project of Jiangsu Province; Jiangsu Funding Program for Excellent Postdoctoral Talent

Abstract:

With the development of information technology, users and organizations have paid increasing attention to network security, and encrypted data transmission has gradually become mainstream, driving a steady rise in the proportion of encrypted traffic on the Internet. However, while data encryption protects privacy and security, it has also become a means for illegal content to evade network supervision. Detecting and analyzing encrypted traffic first requires identifying it efficiently, yet the presence of compressed traffic seriously interferes with this identification. To address this problem, an encrypted traffic identification scheme based on sliding windows and randomness features was designed to identify encrypted traffic efficiently and accurately. Specifically, the scheme sampled the payloads of data-transmission packets in a session with a sliding window mechanism to obtain a sequence of data blocks that reflects the information patterns of the original traffic. Randomness measurement algorithms were then applied to each data block to extract sample features and construct randomness features for the original payload. In addition, a decision tree model based on the CART (classification and regression tree) algorithm was designed, which improved the accuracy of identifying encrypted and compressed traffic while greatly reducing the false negative rate of encrypted traffic identification. A balanced dataset was constructed by randomly sampling data from several authoritative websites, and experiments demonstrated the feasibility and efficiency of the proposed scheme.
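As a rough illustration of the sampling and feature-extraction steps described in the abstract, the following Python sketch slides a fixed-size window over a payload and computes a few randomness measures per data block. The abstract does not state the window size, the stride, or which randomness measures the scheme uses, so the values `window=256` and `stride=128`, the three measures (byte entropy, chi-square statistic, fraction of 1 bits), and all function names are assumptions chosen for illustration only.

```python
import math
import os
import zlib
from collections import Counter


def sliding_blocks(payload: bytes, window: int = 256, stride: int = 128):
    """Slide a fixed-size window over the payload and return the sampled blocks.

    The window and stride defaults are illustrative assumptions, not values
    taken from the paper. A payload shorter than one window yields one block.
    """
    if len(payload) < window:
        return [payload] if payload else []
    return [payload[i:i + window]
            for i in range(0, len(payload) - window + 1, stride)]


def shannon_entropy(block: bytes) -> float:
    """Byte-level Shannon entropy in bits per byte (between 0.0 and 8.0)."""
    n = len(block)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(block).values())


def chi_square(block: bytes) -> float:
    """Chi-square statistic of the byte histogram against a uniform distribution."""
    expected = len(block) / 256
    counts = Counter(block)
    return sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))


def monobit_fraction(block: bytes) -> float:
    """Fraction of bits set to 1; close to 0.5 for random-looking data."""
    ones = sum(bin(byte).count("1") for byte in block)
    return ones / (8 * len(block))


def randomness_features(payload: bytes, window: int = 256, stride: int = 128):
    """Return one (entropy, chi-square, monobit) tuple per sampled block."""
    return [(shannon_entropy(b), chi_square(b), monobit_fraction(b))
            for b in sliding_blocks(payload, window, stride)]


if __name__ == "__main__":
    # os.urandom output stands in for an encrypted payload (an assumption);
    # the zlib stream stands in for compressed traffic.
    samples = {
        "random-like": os.urandom(4096),
        "compressed": zlib.compress(os.urandom(512) * 8),
    }
    for name, data in samples.items():
        feats = randomness_features(data)
        mean_entropy = sum(f[0] for f in feats) / len(feats)
        print(name, len(feats), "blocks, mean entropy", round(mean_entropy, 3))
```

The per-block tuples could then be aggregated into a feature vector and passed to a CART-style classifier, for example scikit-learn's `DecisionTreeClassifier`, which implements an optimized version of CART. That pairing is a suggestion for experimentation and is not stated in the abstract as the authors' implementation.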

Key words: encrypted traffic, compressed traffic, randomness feature, sliding sampling

CLC Number:
