Telecommunications Science, 2020, Vol. 36, Issue (3): 71-82. doi: 10.11959/j.issn.1000-0801.2020055

• Research and Development •

Evolving forest hashing: an unsupervised online hash learning algorithm

Zhenyu SHOU, Jiangbo QIAN, Yihong DONG, Huahui CHEN

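  1. Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, Zhejiang, China
  • Revised: 2020-02-26 Online: 2020-03-20 Published: 2020-03-26
  • About the authors: Zhenyu SHOU (1993- ), male, master's student at the Faculty of Electrical Engineering and Computer Science, Ningbo University; research interests: machine learning, artificial intelligence, and big data retrieval | Jiangbo QIAN (1974- ), male, Ph.D., professor at the Faculty of Electrical Engineering and Computer Science, Ningbo University; research interests: data processing and mining, logic circuit design, and multidimensional indexing and query optimization | Yihong DONG (1969- ), male, Ph.D., professor at the Faculty of Electrical Engineering and Computer Science, Ningbo University; research interests: big data, data mining, and artificial intelligence | Huahui CHEN (1964- ), male, Ph.D., professor at the Faculty of Electrical Engineering and Computer Science, Ningbo University; research interests: data processing and mining, and cloud computing
  • Supported by:
    Zhejiang Provincial Natural Science Foundation of China (LZ20F020001); Zhejiang Provincial Natural Science Foundation of China (LY20F020009); The National Natural Science Foundation of China (61472194); The National Natural Science Foundation of China (61572266); Ningbo Municipal Natural Science Foundation of China (2019A610085)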

EFH: an online unsupervised hash learning algorithm

Zhenyu SHOU, Jiangbo QIAN, Yihong DONG, Huahui CHEN

  1. Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, China
  • Revised: 2020-02-26 Online: 2020-03-20 Published: 2020-03-26
  • Supported by:
    Zhejiang Provincial Natural Science Foundation of China (LZ20F020001); Zhejiang Provincial Natural Science Foundation of China (LY20F020009); The National Natural Science Foundation of China (61472194); The National Natural Science Foundation of China (61572266); Ningbo Municipal Natural Science Foundation of China (2019A610085)

Abstract:

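Current unsupervised hash learning algorithms must load the entire dataset during training, which consumes a large amount of memory and makes them unsuitable for streaming data. As an exploratory step, an unsupervised online hash learning algorithm called evolving forest hashing is proposed. Targeting large-scale data retrieval, an improved evolving tree is used to learn the spatial topology of the data, and a path coding strategy is proposed that maps the path a data point takes when traversing the evolving tree to a similarity-preserving binary code. To further improve query performance, online evolving forest hashing is proposed on the basis of evolving tree hashing. Finally, experiments on two widely used datasets demonstrate the feasibility of the proposed method.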

Key words: nearest neighbor query, evolving tree, online, hash learning, ensemble learning

Abstract:

Many unsupervised learning-to-hash algorithms need to load all data into memory during the training phase, which occupies a large amount of memory and makes them unsuitable for streaming data. An unsupervised online learning-to-hash algorithm called evolving forest hashing (EFH) is proposed. For large-scale data retrieval scenarios, an improved evolving tree is used to learn the spatial topology of the data. A path coding strategy is proposed to map the path that a data point follows to a leaf node into a similarity-preserving binary code. To further improve query performance, ensemble learning is incorporated, and an online evolving forest hashing method is built on top of the evolving trees. Finally, experiments on two widely used datasets demonstrate the feasibility of the method.

Key words: nearest neighbor query, evolving tree, online, hash learning, ensemble learning
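
The path coding idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a fixed binary tree whose nodes hold prototype vectors, assumes a point descends toward the child with the nearest prototype, and emits one bit per step. The names `Node`, `path_code`, `forest_code`, `hamming`, and `build_demo_tree`, the zero padding, and the demo prototypes are all hypothetical. The online training and tree growth described in the paper are not shown.

```python
import numpy as np


class Node:
    """Tree node holding a prototype vector (hypothetical structure)."""

    def __init__(self, prototype, left=None, right=None):
        self.prototype = np.asarray(prototype, dtype=float)
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None and self.right is None


def path_code(root, x, depth):
    """Walk from the root toward the nearest child prototype.

    Emits 0 for a left step and 1 for a right step, then pads with
    zeros so every code has exactly `depth` bits (an assumption).
    """
    x = np.asarray(x, dtype=float)
    bits = []
    node = root
    while not node.is_leaf() and len(bits) < depth:
        d_left = np.linalg.norm(x - node.left.prototype)
        d_right = np.linalg.norm(x - node.right.prototype)
        go_right = d_right < d_left
        bits.append(int(go_right))
        node = node.right if go_right else node.left
    bits.extend([0] * (depth - len(bits)))
    return bits


def forest_code(roots, x, depth):
    """Concatenate the path codes from several trees (ensemble step)."""
    code = []
    for root in roots:
        code.extend(path_code(root, x, depth))
    return code


def hamming(a, b):
    """Number of differing bits between two equal-length codes."""
    return sum(u != v for u, v in zip(a, b))


def build_demo_tree():
    """Hand-built depth-2 tree over 2-D points; prototypes are made up."""
    ll, lr = Node([0.0, 0.0]), Node([0.0, 2.0])
    rl, rr = Node([10.0, 0.0]), Node([10.0, 2.0])
    left = Node([0.0, 1.0], ll, lr)
    right = Node([10.0, 1.0], rl, rr)
    return Node([5.0, 1.0], left, right)


tree = build_demo_tree()
near_a = path_code(tree, [0.1, 0.2], depth=2)
near_b = path_code(tree, [0.2, 0.1], depth=2)
far = path_code(tree, [9.8, 1.9], depth=2)
print(near_a, near_b, far)
print(hamming(near_a, near_b), hamming(near_a, far))
```

In this toy setup, two nearby points share the same path and therefore the same code, while a distant point takes a different branch at every level. Points close to a branch boundary can still receive different codes in a single tree, which is one plausible reason to combine several trees as in `forest_code`.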
