通信学报 ›› 2015, Vol. 36 ›› Issue (5): 174-186.doi: 10.11959/j.issn.1000-436x.2015101

• 学术通信 • 上一篇    

基于多维有限自动机的DFA改进算法

宫阳阳1,刘勤让1,杨镇西1,邵翔宇1,邢池强1,焦慧娟2,彭志彬1   

  1. 1 国家数字交换系统工程技术研究中心,河南 郑州 450002
    2 中国石油大学 化工学院,山东 青岛266555
  • 出版日期:2015-05-20 发布日期:2015-07-17
  • 基金资助:
    国家高技术研究发展计划(“863”计划)基金资助项目;国家高技术研究发展计划(“863”计划)基金资助项目;国家重点基础研究发展计划(“973”计划)基金资助项目;国家重点基础研究发展计划(“973”计划)基金资助项目;国家科技支撑计划基金资助项目

Improved DFA algorithm based on multi-dimensional finite automata

ONGYang-yang G1,IUQin-rang L1,ANGZhen-xi Y1,HAOXiang-yu S1,INGChi-qiang X1,IAOHui-juan J2,ENGZhi-bin P1   

  1. 1 National Digital Switching System Engineering R&D Center,Zhengzhou 450002,China
    2 Collge of Chemical Engineering,China University of Petroleum,Qingdao 266555,China
  • Online:2015-05-20 Published:2015-07-17
  • Supported by:
    The National High Technology Research and Development Program of China (863 Program);The National High Technology Research and Development Program of China (863 Program);The National Basic Research Program of China (973 Progrma);The National Basic Research Program of China (973 Progrma);The National Key Technology R&D Program

摘要:

多个正则表达式规则编译成一个DFA(deter minister finite automata)时,会产生状态爆炸、存储急剧增加的现象。针对最严重的状态爆炸问题,从信息论的角度给出了解释,并提出多维数学模型,将冗余状态分为0维状态和1维状态,通过前者按照维度压缩,后者动态构建的方法将空间复杂度降到理论下界,并在此基础上提出多维有限自动机(MFA,multi-dimensional finite automata)。实验表明,MFA构造时间比XFA略少,比DFA、STT冗余压缩算法和Hybrid-FA降低了2~3个数量级;存储空间比XFA略高,比DFA、STT冗余压缩算法、mDFA、Hybrid-FA降低了1~2个数量级;匹配时间比DFA、Hybrid-FA略多,但是比XFA略少,比STT冗余压缩算法和mDFA降低了1~2个数量级。

关键词: 正则表达式, DFA, 有限自动机, 状态爆炸

Abstract:

Compiling multiple regular expression signatures into a combined DFA can blowup in state and storage space.Explanations from the prospective of information theory and multi-dimensional mathematical model were proposed fo-cusing on the most serious state explosion.Redundancy states were divided into zero-dimensional ones and one-dimensional ones.The former were compressed by dimension,and the later were dynamically built.The space com-plexity of the model came to the theoretical lower bound by the above methods.Then the multi-dimensional finite auto-mata (MFA) was proposed with the model.Experiments show that,the construction time taken by MFA is slightly less than XFA and is 2~3 orders of magnitude lower than DFA,STT redundancy compression algorithms and Hybrid-FA; the memory space of MFA is slightly higher than XFA,but is 1~2 orders of magnitude lower than DFA,STT redundancy compression algorithms,mDFA and Hybrid-FA; in the aspect of matching time,MFA ranks after DFA and Hybrid-FA,but ranks before XFA,and it achieves 1~2 orders of magnitude lower than that of STT redundancy compression algo-rithms and mDFA.

Key words: regular expression, DFA, finite automata, state explosion

No Suggested Reading articles found!