Journal on Communications ›› 2021, Vol. 42 ›› Issue (10): 173-181. doi: 10.11959/j.issn.1000-436x.2021192

• Papers •

Text similarity detection method based on NLP

Xiaoli DAI1,2, Shifeng LIU1, Daqing GONG1   

  1 School of Economics and Management, Beijing Jiaotong University, Beijing 100044, China
  2 China InfoCom Media Group, Beijing 100078, China
  • Revised: 2021-09-13 Online: 2021-10-25 Published: 2021-10-01
  • Supported by:
    The National Natural Science Foundation of China (J1824031)

Abstract:

Current text similarity detection methods ignore document structure information and do not account for semantic relevance between words. To solve these problems, a text-oriented similarity detection method was proposed. First, the analytic hierarchy process (AHP) was used to calculate word position weights, which were then used to extract feature words. Second, the Pearson correlation coefficient was used to measure the semantic correlation between words, and this correlation served as the weight in a generalized Dice coefficient for calculating similarity. Experimental results show that the proposed method can improve the precision of feature word extraction and the accuracy of similarity calculation results.
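
The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the row geometric-mean approximation for the AHP weights, the clipping of negative correlations to zero, and the particular weighted form of the generalized Dice coefficient are all assumptions, as are the function names, the pairwise comparison matrix, and the toy data.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how much more important position i is than j.
    (Assumption: the paper may use the principal eigenvector instead.)
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def feature_weights(position_counts, pos_w):
    """Map each word to the sum of (count at position * position weight)."""
    return {
        word: sum(c * w for c, w in zip(counts, pos_w))
        for word, counts in position_counts.items()
    }


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / math.sqrt(var_x * var_y)


def weighted_dice(doc_a, doc_b, relatedness):
    """Generalized Dice coefficient with a word-to-word relatedness weight.

    doc_a, doc_b: dict mapping feature word -> weight.
    relatedness(u, v): value in [0, 1], equal to 1 when u == v.
    With relatedness = exact match, this reduces to the usual
    2 * sum(a_i * b_i) / (sum(a_i^2) + sum(b_i^2)).
    """
    def cross(p, q):
        return sum(pw * qw * relatedness(u, v)
                   for u, pw in p.items() for v, qw in q.items())

    denom = cross(doc_a, doc_a) + cross(doc_b, doc_b)
    return 2.0 * cross(doc_a, doc_b) / denom if denom else 0.0


# Hypothetical pairwise judgments for positions: title, abstract, body.
pos_w = ahp_weights([[1, 3, 5], [1 / 3, 1, 3], [1 / 5, 1 / 3, 1]])

# Hypothetical per-position word counts (title, abstract, body).
doc_a = feature_weights({"text": [1, 2, 4], "similarity": [1, 1, 3]}, pos_w)
doc_b = feature_weights({"text": [0, 1, 5], "likeness": [1, 1, 2]}, pos_w)

# Hypothetical co-occurrence vectors used to correlate pairs of words.
vectors = {
    "text": [3, 0, 2, 5],
    "similarity": [1, 4, 2, 0],
    "likeness": [1, 5, 2, 1],
}


def relatedness(u, v):
    # Negative correlations are clipped to 0 (an assumption of this sketch).
    return 1.0 if u == v else max(0.0, pearson(vectors[u], vectors[v]))


score = weighted_dice(doc_a, doc_b, relatedness)
```

In this sketch, words that appear in higher-priority positions receive larger feature weights, and two documents that use different but correlated words ("similarity" and "likeness" in the toy data) still contribute to the score, which plain exact-match Dice would miss.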

Key words: text similarity, word position weight, analytic hierarchy process, feature word extraction, Pearson correlation coefficient

