[1] |
JURAFSKY D , MARTIN J H . Speech and language processing (3rd ed. draft)[EB].
|
[2] |
李生 . 自然语言处理的研究与发展[J]. 燕山大学学报, 2013,37(5): 377-384.
|
|
LI S . Research and development of natural language processing[J]. Journal of Yanshan University, 2013,37(5): 377-384.
|
[3] |
王慧芳, 曹靖, 罗麟 . 电力文本数据挖掘现状及挑战[J]. 浙江电力, 2019,38(3): 1-7.
|
|
WANG H F , CAO J , LUO L . Current status and challenges of power text data mining[J]. Zhejiang Electric Power, 2019,38(3): 1-7.
|
[4] |
邱剑 . 电力中文文本数据挖掘技术及其在可靠性中的应用研究[D]. 杭州:浙江大学, 2016.
|
|
QIU J . Research on power Chinese text data mining technology and reliability application[D]. Hangzhou:Zhejiang University, 2016.
|
[5] |
汪崔洋, 江全元, 唐雅洁 ,等. 基于告警信号文本挖掘的电力调度故障诊断[J]. 电力自动化设备, 2019,39(4): 126-132.
|
|
WANG C Y , JIANG Q Y , TANG Y J ,et al. Fault diagnosis of power dispatching based on alarm signal text mining[J]. Electric Power Automation Equipment, 2019,39(4): 126-132.
|
[6] |
刘梓权, 王慧芳, 曹靖 ,等. 基于卷积神经网络的电力设备缺陷文本分类模型研究[J]. 电网技术, 2018,42(2): 644-650.
|
|
LIU Z Q , WANG H F , CAO J ,et al. A classification model of power equipment defect texts based on convolutional neural network[J]. Power System Technology, 2018,42(2): 644-650.
|
[7] |
王春柳, 杨永辉, 邓霏 ,等. 文本相似度计算方法研究综述[J]. 情报科学, 2019,37(3): 158-168.
|
|
WANG C L , YANG Y H , DENG F ,et al. A review of text similarity approaches[J]. Information Science, 2019,37(3): 158-168.
|
[8] |
沈斌 . 基于分词的中文文本相似度计算研究[D]. 天津:天津财经大学, 2006.
|
|
SHEN B . Study on Chinese text similarity computing based on word segmentation[D]. Tianjin:Tianjin University of Finance and Economics, 2006.
|
[9] |
JONES K S . A statistical interpretation of term specificity and its application in retrieval[J]. Journal of Documentation, 1972,28(1): 11-21.
|
[10] |
SALTON G , YU C T . On the construction of effective vocabularies for information retrieval[C]// Proceedings of ACM SIGIR Forum. New York:ACM Press, 1973: 48-60.
|
[11] |
MIKOLOV T , CORRADO G , CHEN K ,et al. Efficient estimation of word representations in vector space[C]// Proceedings of the International Conference on Learning Representations (ICLR 2013).[S.l.:s.n.], 2013.
|
[12] |
ROBERTSON S E , ZARAGOZA H . The probabilistic relevance framework:BM25 and beyond[J]. Foundations and Trends in Information Retrieval, 2009,3(4): 333-389.
|
[13] |
FRIEDMAN J H . Greedy function approximation:a gradient boosting machine[J]. The Annals of Statistics, 2001,29(5): 1189-1232.
|
[14] |
KE G , MENG Q , FINLEY T ,et al. LightGBM:a highly efficient gradient boosting decision tree[C]// Advances in Neural Information Processing Systems (NIPS).[S.l.:s.n.], 2017.
|
[15] |
曹小芹 . 基于Python的中文结巴分词技术实现[J]. 信息与电脑, 2019(18): 38-39,42.
|
|
CAO X Q . Technology implementation of Chinese jieba segmentation based on Python[J]. China Computer & Communication, 2019(18): 38-39,42.
|
[16] |
陈正铭, 霍英 . 编辑距离算法在中文文本相似度计算中的优化与实现[J]. 韶关学院学报, 2019,36(12): 8-12.
|
|
CHEN Z M , HUO Y . Optimization and implementation of the edit distance algorithm in Chinese similarity calculation[J]. Journal of Shaoguan University, 2019,36(12): 8-12.
|
[17] |
余小军, 刘峰, 张春 . 基于 N-Gram 文本特征提取的改进算法[J]. 现代计算机, 2012(23): 3-7.
|
|
YU X J , LIU F , ZHANG C . Improved text feature extraction algorithm based on N-Gram[J]. Modern Computer, 2012(23): 3-7.
|