网络与信息安全学报 ›› 2023, Vol. 9 ›› Issue (2): 81-93. doi: 10.11959/j.issn.2096-109x.2023023

• 学术论文 •

渐进式的协议状态机主动推断方法

潘雁, 林伟, 祝跃飞   

  1. 信息工程大学,河南 郑州 450001
  • 修回日期:2023-02-05 出版日期:2023-04-25 发布日期:2023-04-01
  • 作者简介:潘雁(1995- ),男,安徽安庆人,信息工程大学博士生,主要研究方向为网络协议分析、软件逆向
    林伟(1986- ),男,湖南常德人,信息工程大学讲师,主要研究方向为软件保护与分析、协议逆向
    祝跃飞(1962- ),男,浙江兰溪人,信息工程大学教授、博士生导师,主要研究方向为网络安全、密码学
  • 基金资助:
    国家重点研发计划(2019QY1300)

Progressive active inference method of protocol state machine

Yan PAN, Wei LIN, Yuefei ZHU   

  1. Information Engineering University, Zhengzhou 450001, China
  • Revised: 2023-02-05 Online: 2023-04-25 Published: 2023-04-01
  • Supported by:
    The National Key R&D Program of China (2019QY1300)

摘要:

主动协议状态机推断的理论基础为主动自动机学习,所面临的核心问题是字母表的抽象与映射器的构建。同一类型消息取值的多样性可能导致同一类型的数据包存在不同的响应类型,从而导致当前使用类型作为字母表的方法会丢失状态或状态转移。对此,依据不同的响应将协议类型细化为子类型,提出一种渐进式主动推断方法。基于已有协议数据提取协议状态字段,构建初始字母表与映射器,基于主动推断方法得到初始状态机;对数据进行确定性变异,若输入输出类型序列与当前状态机不符,则将变异后数据视为协议子类型,并添加至字母表,同时依据新的字母表进行新的状态机推断。此外,为减少协议实际交互次数,依据协议特性,在主动推断算法的缓存机制基础上提出一种基于前缀匹配的预响应查询算法。实现了开源框架ProLearner,并以SMTP和RTSP为对象,通过扩展协议子类型获得了更为详细的协议行为,验证了所提方法的有效性;此外,实验结果表明预响应查询算法可有效减少实际交互的次数,平均降低的实际交互次数约为10%。

关键词: 协议逆向分析, 主动自动机学习, 协议状态机推断, Mealy自动机, 映射器

Abstract:

Active inference of protocol state machines builds on active automata learning, where the core challenges are abstracting the alphabet and constructing the mapper. Because messages of the same type can carry many different values, packets of one type may elicit different response types, so methods that use message types as the alphabet can miss states or state transitions. To address this, message types were refined into subtypes according to their responses, and a progressive active inference method was proposed. The method extracted the protocol state fields from existing protocol data to construct the initial alphabet and mapper, and obtained an initial state machine through active inference. It then applied deterministic mutation to the existing messages; if the resulting input/output type sequence was inconsistent with the current state machine, the mutated message was treated as a protocol subtype and added to the alphabet, and a new state machine was inferred from the extended alphabet. In addition, to reduce the number of actual protocol interactions, a pre-response query algorithm based on prefix matching was proposed on top of the caching mechanism of the active inference algorithm, exploiting protocol characteristics. The method was implemented in the open-source framework ProLearner and evaluated on the SMTP and RTSP protocols: extending the protocol subtypes yielded more detailed protocol behavior, which verified the effectiveness of the method. The experimental results also show that the pre-response query algorithm effectively reduces the number of actual interactions, by about 10% on average.
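The abstract does not spell out the pre-response query algorithm, so the following is only a minimal sketch of one plausible reading, not the ProLearner implementation. It assumes a deterministic target, a trie-shaped cache of abstract input/output traces, and a hypothetical `CLOSED` output symbol meaning the server has dropped the connection; once a cached prefix ends in `CLOSED`, every later output is predicted as `CLOSED` without contacting the target. The names `PreResponseCache` and `toy_smtp` are illustrative.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

CLOSED = "CLOSED"  # hypothetical abstract output: the connection is gone
Word = Tuple[str, ...]


class PreResponseCache:
    """Trie cache that answers queries from cached prefixes when it can."""

    def __init__(self, query_target: Callable[[Word], List[str]]) -> None:
        self._query_target = query_target  # performs a real interaction
        self._root: Dict[str, tuple] = {}  # symbol -> (output, child trie)
        self.real_interactions = 0

    def _lookup(self, word: Word) -> Optional[List[str]]:
        node = self._root
        outputs: List[str] = []
        for index, symbol in enumerate(word):
            if symbol not in node:
                return None  # cache miss: a real interaction is needed
            output, node = node[symbol]
            outputs.append(output)
            if output == CLOSED:
                # Pre-response (assumption): after the connection closes,
                # every remaining input is answered with CLOSED.
                outputs.extend([CLOSED] * (len(word) - index - 1))
                return outputs
        return outputs

    def _store(self, word: Word, outputs: List[str]) -> None:
        node = self._root
        for symbol, output in zip(word, outputs):
            node = node.setdefault(symbol, (output, {}))[1]

    def query(self, word: Sequence[str]) -> List[str]:
        key = tuple(word)
        cached = self._lookup(key)
        if cached is not None:
            return cached
        self.real_interactions += 1
        outputs = list(self._query_target(key))
        self._store(key, outputs)
        return outputs


def toy_smtp(word: Word) -> List[str]:
    """Toy stand-in for a real server, using abstract symbols only."""
    greeted, closed = False, False
    outputs: List[str] = []
    for symbol in word:
        if closed:
            outputs.append(CLOSED)
        elif symbol == "QUIT":
            outputs.append("221")
            closed = True
        elif symbol == "HELO":
            outputs.append("250")
            greeted = True
        elif symbol == "MAIL" and greeted:
            outputs.append("250")
        else:
            outputs.append("503")
    return outputs


cache = PreResponseCache(toy_smtp)
cache.query(["HELO", "QUIT", "HELO"])                  # real interaction
cache.query(["HELO", "QUIT", "HELO", "MAIL", "HELO"])  # predicted from the cache
print(cache.real_interactions)
```

In this sketch the longer second query costs no real interaction, because its cached prefix already ends in `CLOSED`. Which protocol characteristics the authors actually exploit is not stated in the abstract.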

Key words: protocol reverse analysis, active automata learning, protocol state machine inference, Mealy automata, mapper
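To make the Mealy automaton and subtype refinement concrete, here is a small sketch of the check described in the abstract: a mutated message is promoted to a subtype when its observed outputs disagree with what the current hypothesis predicts for its base type. The `Mealy` class, `find_subtypes`, the symbol names, and the toy server are all hypothetical; the real method would additionally rerun the learner on the extended alphabet, which is not shown.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class Mealy:
    """Minimal Mealy machine: (state, input) -> (next state, output)."""

    def __init__(self, initial: str,
                 transitions: Dict[Tuple[str, str], Tuple[str, str]]) -> None:
        self.initial = initial
        self.transitions = transitions

    def run(self, word: Sequence[str]) -> List[str]:
        state, outputs = self.initial, []
        for symbol in word:
            state, output = self.transitions[(state, symbol)]
            outputs.append(output)
        return outputs


def find_subtypes(hypothesis: Mealy, base_type: str, mutants: Sequence[str],
                  contexts: Sequence[Sequence[str]],
                  observe: Callable[[List[str]], List[str]]) -> List[str]:
    """Return mutants whose behavior differs from the base type's prediction."""
    promoted = []
    for mutant in mutants:
        for prefix in contexts:
            expected = hypothesis.run(list(prefix) + [base_type])
            actual = observe(list(prefix) + [mutant])
            if actual != expected:
                promoted.append(mutant)  # treat as a new subtype
                break
    return promoted


def toy_server(word: List[str]) -> List[str]:
    """Toy target: one mutated MAIL variant is rejected with 501."""
    greeted, outputs = False, []
    for symbol in word:
        if symbol == "HELO":
            greeted = True
            outputs.append("250")
        elif not greeted:
            outputs.append("503")
        elif symbol == "MAIL_BADADDR":
            outputs.append("501")
        else:
            outputs.append("250")
    return outputs


# Hypothesis learned over message types only (illustrative).
hypothesis = Mealy("s0", {
    ("s0", "HELO"): ("s1", "250"),
    ("s0", "MAIL"): ("s0", "503"),
    ("s1", "HELO"): ("s1", "250"),
    ("s1", "MAIL"): ("s1", "250"),
})
alphabet = ["HELO", "MAIL"]
alphabet += find_subtypes(hypothesis, "MAIL",
                          ["MAIL_LOWERCASE", "MAIL_BADADDR"],
                          [(), ("HELO",)], toy_server)
print(alphabet)
```

Here only `MAIL_BADADDR` deviates from the hypothesis (after `HELO`), so it alone joins the alphabet; `MAIL_LOWERCASE` behaves like plain `MAIL` and is discarded.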

