Journal on Communications ›› 2012, Vol. 33 ›› Issue (12): 43-48. doi: 10.3969/j.issn.1000-436x.2012.12.006

• Papers •

Novel kernel function for computing the similarity of text

Xiu-hong WANG 1,2,3,4, Shi-guang JU 1

  1 Institute of Science and Technology Information, Jiangsu University, Zhenjiang 212013, China
  2 Faculty of Science, Jiangsu University, Zhenjiang 212013, China
  3 College of Agricultural and Environmental Sciences, University of California-Davis, Davis 95616, USA
  4 School of Computer Science and Telecommunication Engineering, Jiangsu University, Zhenjiang 212013, China
  • Online: 2012-12-25 Published: 2017-07-15

Abstract:

To improve the detection of similar documents, a novel kernel function named the S_Wang kernel was constructed. Based on the practical requirements of computing text similarity, the S_Wang kernel was built to take into account both the Euclidean distance and the angle between the vectors that represent the text documents being compared. It was proved, according to Mercer's theorem, that the function is a valid kernel function. The performance of the kernels in text document similarity calculation was verified experimentally. The results show that the S_Wang kernel achieves significantly better precision and F1 scores than other kernels such as the Cauchy kernel, the Latent Semantic Kernel (LSK), and the CLA kernel. The S_Wang kernel is therefore suitable for text similarity computation.

Key words: information retrieval, text similarity, kernel function, S_Wang kernel, LSK, Cauchy kernel, CLA kernel
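
The abstract does not give the closed form of the S_Wang kernel, so it cannot be reproduced here. As a rough illustration of the two ingredients it names (Euclidean distance and the angle between document vectors), the following Python sketch computes both on simple term-frequency vectors and evaluates the standard Cauchy kernel, one of the baselines mentioned. The function `combined_kernel` is a hypothetical stand-in (a product of the Cauchy kernel and cosine similarity), not the S_Wang kernel; the helper names, the whitespace tokenizer, and the default `sigma` are all assumptions made for this example.

```python
import math
from collections import Counter


def tf_vector(text, vocab):
    """Term-frequency vector of `text` over a fixed vocabulary (whitespace tokens)."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]


def euclidean_distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_similarity(x, y):
    """Cosine of the angle between x and y; defined as 0.0 if either is all zeros."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 or ny == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(x, y)) / (nx * ny)


def cauchy_kernel(x, y, sigma=1.0):
    """Cauchy kernel: 1 / (1 + ||x - y||^2 / sigma^2)."""
    d = euclidean_distance(x, y)
    return 1.0 / (1.0 + (d * d) / (sigma * sigma))


def combined_kernel(x, y, sigma=1.0):
    """Hypothetical stand-in, NOT the S_Wang kernel: uses both distance and angle.

    A product of two positive semi-definite kernels is itself positive
    semi-definite, so this product satisfies Mercer's condition.
    """
    return cauchy_kernel(x, y, sigma) * cosine_similarity(x, y)


def gram_matrix(vectors, kernel):
    """Matrix of pairwise kernel values for a list of vectors."""
    return [[kernel(u, v) for v in vectors] for u in vectors]


def quadratic_form(K, c):
    """c^T K c; non-negative for every c when K is positive semi-definite."""
    n = len(c)
    return sum(c[i] * K[i][j] * c[j] for i in range(n) for j in range(n))


docs = [
    "kernel text similarity",
    "text similarity kernel",
    "agricultural science",
]
vocab = sorted({w for d in docs for w in d.lower().split()})
vectors = [tf_vector(d, vocab) for d in docs]
K = gram_matrix(vectors, combined_kernel)
```

Evaluating `quadratic_form(K, c)` for a few coefficient vectors `c` is only a spot check of the Mercer condition on one sample of documents; it does not replace the kind of proof the abstract refers to.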
