通信学报 ›› 2016, Vol. 37 ›› Issue (3): 139-147.doi: 10.11959/j.issn.1000-436x.2016061

• 学术论文 • 上一篇    下一篇

CBFM:支持属性删减的布鲁姆过滤器矩阵多维元素查询算法

王勇1,云晓春1,2,王树鹏1,王曦1   

  1. 1 中国科学院信息工程研究所 北京100093
    2 国家计算机网络应急技术处理协调中心 北京100029
  • 出版日期:2016-03-25 发布日期:2017-08-04
  • 基金资助:
    国家自然科学基金资助项目;国家自然科学基金资助项目;国家自然科学基金资助项目;国家自然科学基金资助项目;国家高技术研究发展计划(“863”计划)基金资助项目;国家高技术研究发展计划(“863”计划)基金资助项目;国家高技术研究发展计划(“863”计划)基金资助项目

CBFM:cutted Bloom filter matrix for multi-dimensional membership query

Yong WANG1,Xiao-chun YUN1,2,ANGShu-peng WANG1,Xi WANG1   

  1. 1 Institute of Information Engineering,Chinese Academy of Sciences,Beijing 100093,China
    2 National Computer Network Emergency Response Technical Team/Coordination Center of China,Beijing 100029,China
  • Online:2016-03-25 Published:2017-08-04
  • Supported by:
    The National Natural Science Foundation of China;The National Natural Science Foundation of China;The National Natural Science Foundation of China;The National Natural Science Foundation of China;The National High Technology Research and Development Program of China (863 Program);The National High Technology Research and Development Program of China (863 Program);The National High Technology Research and Development Program of China (863 Program)

摘要:

为了提升多维元素成员查询的灵活性和准确率,提出了一种新型索引结构CBFM(cutted Bloom filter matrix)。该索引方法通过独立属性布鲁姆过滤器笛卡尔乘积构建位矩阵,支持任意属性组合的多维元素成员查询,同时支持属性组合按需删减和属性加权,极大地提升内存空间利用率,降低查询误判率。理论分析证明相比于BFM(Bloom filter matrix)索引方法,CBFM具有更高的内存利用率。仿真实验表明,在分配内存相同的情况下,CBFM方法相比于其他方法,具有最低的查询误判率,特别在内存受限场景下,CBFM相比于BFM方法,查询误判率最大降低3个数量级,极大地提升了多维元素成员查询的准确率。

关键词: 查询算法, 多维元素成员查询, 布鲁姆过滤器, 位矩阵

Abstract:

In order to improve the flexibility and accuracy of mu i-dimensional membership query,a new indexing structure called CBFM(cutted Bloom filter matrix)was proposed.CBFM built the bit matrix by the Cartesian product of different bloom filters,each representing one attribute of primary data.In this way,the proposed matrix supported by-attribute membership query.Besides,the attribute combinations in the bit matrix could be reduced and weighted on demand to further enhance memory utilization rate.Theoretical analysis proves that CBFM utilizes memory more effi-ciently than BFM,the current state of art.Experiments also show that,on the scenario of memory size fixed,the false positive rate of CBFM is lower than that of all other indexin ethods.Especially on the scenario of memory constrained,the false positive rate of CBFM can be 3 orders of magnitude lower than that of BFM(Bloom filter matrix) indexing me-thod.CBFM is an accurate data structure for multi-dimensional membership query.

Key words: query algorithm, multi-dimensional membership query, Bloom filter, bit matrix

No Suggested Reading articles found!