Journal on Communications ›› 2021, Vol. 42 ›› Issue (8): 176-187. doi: 10.11959/j.issn.1000-436x.2021150

• Academic paper

Service clustering method based on description context feature words and improved GSDMM model

Qiang HU (胡强), Jiaji SHEN (沈嘉吉), Guanghui JING (荆广辉), Junwei DU (杜军威)

  1. School of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
  • Revised: 2021-06-29 Online: 2021-08-25 Published: 2021-08-01
  • About the authors: Qiang HU (born 1980), male, from Zoucheng, Shandong, is an associate professor and master's supervisor at Qingdao University of Science and Technology. His main research interests are service computing and artificial intelligence.
    Jiaji SHEN (born 1997), male, from Shanghai, is a master's student at Qingdao University of Science and Technology. His main research interest is service computing.
    Guanghui JING (born 1996), male, from Rizhao, Shandong, is a master's student at Qingdao University of Science and Technology. His main research interests are text mining and recommender systems.
    Junwei DU (born 1974), male, from Wendeng, Shandong, is a professor and doctoral supervisor at Qingdao University of Science and Technology. His main research interests are software engineering and artificial intelligence.
  • Supported by:
    The National Natural Science Foundation of China (61973180); The Natural Science Foundation of Shandong Province (ZR2019MF033); The Key Research and Development Program of Shandong Province (2018GGX101052); The National Key Research and Development Program of China (2018YFB1702902)

Abstract:

To address the low quality of the service representation vectors produced by existing service clustering methods, a service clustering method based on description context feature words and an improved GSDMM model was proposed. First, a feature word extraction method based on context weight was constructed: words that fit the context of a service description well were extracted to form the set of functional feature words used to generate the service representation vector. Then, a GSDMM model with a topic distribution probability correction factor was established to generate service representation vectors and to correct the probability distribution of non-critical topic items. Finally, based on the corrected service representation vectors, the K-means++ algorithm was used to cluster the Web services. Multiple rounds of experiments were conducted on real Web services from Programmable Web. The results show that the service representation vectors generated by the proposed method are of significantly higher quality than those generated by other commonly used topic models, and that the resulting service clustering algorithm outperforms other commonly used algorithms.

Key words: Web service, service clustering, topic model, GSDMM
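The final step described in the abstract, clustering service representation vectors with K-means++, can be illustrated with a minimal sketch. It does not reproduce the paper's context-weight feature word extraction or its improved GSDMM model. The toy vectors below are invented topic-probability vectors standing in for corrected service representation vectors, and the function names `sq_dist`, `kmeanspp_init` and `kmeans` are hypothetical, chosen only for this example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_init(points, k, rng):
    """K-means++ seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to a uniform pick.
            centers.append(list(rng.choice(points)))
        else:
            centers.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centers

def kmeans(points, k, iters=50, seed=0):
    """Lloyd iterations started from K-means++ seeds.
    Returns (labels, centers)."""
    rng = random.Random(seed)
    centers = kmeanspp_init(points, k, rng)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy data (an assumption for illustration): six services described by
# probabilities over three topics, two services per dominant topic.
vectors = [
    [0.90, 0.05, 0.05], [0.85, 0.10, 0.05],
    [0.05, 0.90, 0.05], [0.10, 0.80, 0.10],
    [0.05, 0.05, 0.90], [0.10, 0.10, 0.80],
]
labels, centers = kmeans(vectors, k=3)
print(labels)
```

Because the seeding is randomized, the cluster indices assigned to each group depend on the seed; only the grouping itself is meaningful.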
