Journal on Communications ›› 2014, Vol. 35 ›› Issue (12): 196-202. doi: 10.3969/j.issn.1000-436x.2014.12.023

• Correspondences

Large-scale duplicate image retrieval technical research for the internet

Shu-peng WANG¹, Ming CHEN², Guang-jun WU¹

  1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
  2. School of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000, China
  • Online: 2014-12-25  Published: 2017-06-17
  • Supported by:
    The National Natural Science Foundation of China; The National High Technology Research and Development Program of China (863 Program); Beijing Municipal Science and Technology Project

Abstract:

For typical social media applications on the internet, a large-scale distributed duplicate image retrieval approach based on random projection and block DCT coefficients was proposed. Built on Hadoop, the approach used image signatures generated by random projection mapping to query HBase efficiently, yielding a set of candidate images with high recall. To improve retrieval precision, the block DCT coefficients were then used to filter the candidate images further. Experiments on 12 million images showed that, with H=2 and T=150, the approach reached a recall of 98% and a precision of 93.2%, with an average retrieval time of 6.7 s.
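
The two-stage idea in the abstract (a coarse, high-recall lookup by random projection signature, followed by a finer block DCT filter) can be illustrated with a minimal in-memory sketch. This is not the authors' implementation: it does not model Hadoop or HBase, and the feature choice (mean-centered pixels), the signature length, the 8x8 block size, the number of retained coefficients, and the thresholds `max_hamming` and `max_dct_dist` are all assumptions made for illustration. In particular, these thresholds are hypothetical and are not claimed to correspond to the parameters H and T, which the abstract does not define.

```python
import math
import random


def make_projections(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian directions of length dim (seeded for repeatability)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def centered_pixels(image):
    """Assumed feature vector: flattened grayscale pixels minus their mean."""
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    return [p - mean for p in flat]


def signature(features, projections):
    """Binary signature: one bit per projection, set when the dot product is >= 0."""
    bits = 0
    for direction in projections:
        dot = sum(w * x for w, x in zip(direction, features))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer signatures."""
    return bin(a ^ b).count("1")


def dct_1d(values):
    """Orthonormal 1-D DCT-II, written out directly for clarity rather than speed."""
    n = len(values)
    out = []
    for k in range(n):
        s = sum(values[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(s * (math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)))
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform rows, then columns, then transpose back."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def block_dct_descriptor(image, block=8, keep=3):
    """Concatenate the keep x keep low-frequency coefficients of every full block."""
    desc = []
    for r in range(0, len(image) - block + 1, block):
        for c in range(0, len(image[0]) - block + 1, block):
            coeffs = dct_2d([row[c:c + block] for row in image[r:r + block]])
            desc.extend(coeffs[u][v] for u in range(keep) for v in range(keep))
    return desc


def find_duplicates(query, database, projections, max_hamming, max_dct_dist):
    """Stage 1: keep images whose signature is close to the query's.
    Stage 2: keep candidates whose block DCT descriptor is also close (L1 distance).
    A real system would precompute and index the signatures instead of scanning."""
    q_sig = signature(centered_pixels(query), projections)
    q_desc = block_dct_descriptor(query)
    hits = []
    for name, img in database.items():
        if hamming(q_sig, signature(centered_pixels(img), projections)) > max_hamming:
            continue  # rejected by the coarse signature stage
        dist = sum(abs(a - b) for a, b in zip(q_desc, block_dct_descriptor(img)))
        if dist <= max_dct_dist:
            hits.append(name)
    return hits


# Toy usage: a synthetic 16x16 grayscale image, an exact copy, and its negative.
image = [[(r * 7 + c * 13) % 256 for c in range(16)] for r in range(16)]
inverted = [[255 - p for p in row] for row in image]
projections = make_projections(dim=16 * 16, n_bits=64)
database = {"copy": [row[:] for row in image], "inverted": inverted}
print(find_duplicates(image, database, projections, max_hamming=2, max_dct_dist=1.0))
```

The cheap signature comparison runs first so that the more expensive DCT descriptor is computed only for the surviving candidates. This mirrors the recall-then-precision ordering described above.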

Key words: social media, random projection mapping, image signature, block DCT coefficients, Hadoop cluster
