Telecommunications Science ›› 2016, Vol. 32 ›› Issue (9): 139-145. doi: 10.11959/j.issn.1000-0801.2016238

• Operation technology wide angle •

Session topic mining for interactive text based on conversational content

Jie PENG¹, Yongge SHI¹, Shengbao GAO²

  ¹ Information Engineering College, Nanchang University, Nanchang 330029, China
  ² Jiangxi Branch of China Telecom Co., Ltd., Nanchang 330029, China
  • Online: 2016-09-15  Published: 2016-10-20
  • Supported by:
    The National Natural Science Foundation of China; Science and Technology Program of Jiangxi Province of China

Abstract:

Traditional topic mining models generally extract only document-level topics from interactive text. To mine session topics as well, and to make the mining model more broadly applicable, a session topic generation model for interactive text based on conversational content is proposed. First, the characteristics of interactive text are analyzed and, building on the concept of a topic tree, a dialog spanning tree with a five-layer structure is defined. On this basis, a session topic generation model (ST-LDA) is constructed from latent Dirichlet allocation (LDA). Finally, Gibbs sampling is used to perform inference on ST-LDA, yielding the session topics and their probability distributions. Validation on real data shows that the ST-LDA model can effectively mine session topics from interactive text. In addition, the results can reduce the complexity of classification algorithms and allow topic-participant associations to be traced back, and the approach generalizes well.

Key words: interactive text, conversational content, session topic mining, dialog spanning tree, latent Dirichlet allocation (LDA)
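
The abstract names Gibbs sampling over an LDA-based model as the inference step. As a rough illustration of that step only, the sketch below implements collapsed Gibbs sampling for plain LDA. It is not the paper's ST-LDA: the five-layer dialog spanning tree and any session-level variables are not specified in the abstract and are not reproduced here. The function name `lda_gibbs`, its parameters, and the default hyperparameters `alpha` and `beta` are assumptions made for this example.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA (an illustration, not ST-LDA).

    docs: list of token lists, e.g. one list per conversation.
    Returns (theta, phi, vocab): per-document topic distributions,
    per-topic word distributions, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    n_dk = [[0] * num_topics for _ in docs]       # document-topic counts
    n_kw = [[0] * V for _ in range(num_topics)]   # topic-word counts
    n_k = [0] * num_topics                        # total tokens per topic
    z = []                                        # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Unnormalized conditional p(z = t | all other assignments).
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta) / (n_k[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Smoothed point estimates from the final counts.
    theta = [
        [(n_dk[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n_kw[k][v] + beta) / (n_k[k] + V * beta) for v in range(V)]
        for k in range(num_topics)
    ]
    return theta, phi, vocab
```

Called on a list of tokenized conversations, for example `lda_gibbs(docs, num_topics=2)`, the function returns one topic distribution per document in `theta` and one word distribution per topic in `phi`. A session-level model such as the one described above would presumably add further structure on top of this basic sampler.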
