Big Data (大数据)

Knowledge-enhanced Policy-guided Interactive Reinforcement Recommendation System

Zhang Yuqi (张宇奇), Huang Xiaowen (黄晓雯), Sang Jitao (桑基韬)

  1. Beijing Jiaotong University

Abstract: Recommender systems are an important means of addressing information overload in social media. Because traditional recommender systems cannot optimize the long-term user experience, researchers have proposed interactive recommender systems and have tried to optimize the recommendation policy with deep reinforcement learning. However, reinforcement-learning-based recommendation algorithms face several problems: sparse feedback, learning from scratch (which harms the user experience), and a large item space. To address these problems, this paper proposes KGP-DQN, an improved knowledge-enhanced, policy-guided interactive reinforcement recommendation model. The model comprises three modules. A behavior knowledge graph representation module combines users' historical behavior with a knowledge graph to address sparse feedback. A policy initialization module provides the reinforcement recommender with an initial policy derived from users' historical behavior, so that learning from scratch does not harm the user experience. A candidate selection module dynamically clusters items by their representations on the behavior knowledge graph to shrink the item space, which addresses the large action space. Experiments on three real-world datasets show that KGP-DQN can train a reinforcement recommender quickly and effectively and achieves good recommendation performance.

Key words: interactive recommender system, deep reinforcement learning, knowledge graph, policy initialization, candidate selection
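The candidate selection idea in the abstract (cluster the items, then let the agent score only a subset) can be illustrated with a minimal sketch. It is not the KGP-DQN implementation: it assumes item embeddings are already available as plain vectors, substitutes ordinary k-means for the paper's dynamic clustering, and uses a dot product as a stand-in for a trained Q-network. All function names and the toy data are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(embeddings, k, iterations=10, seed=0):
    """Plain k-means over item embeddings (assumption: stands in for
    the paper's dynamic clustering). Returns (centroids, assignment)."""
    rng = random.Random(seed)
    item_ids = sorted(embeddings)
    centroids = [list(embeddings[i]) for i in rng.sample(item_ids, k)]
    assignment = {}
    for _ in range(iterations):
        # Assign every item to its nearest centroid.
        for item in item_ids:
            assignment[item] = min(
                range(k),
                key=lambda c: squared_distance(embeddings[item], centroids[c]),
            )
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [embeddings[i] for i in item_ids if assignment[i] == c]
            if members:
                centroids[c] = mean_vector(members)
    return centroids, assignment


def select_candidates(state, embeddings, centroids, assignment):
    """Keep only the items in the cluster whose centroid is nearest to the state."""
    nearest = min(
        range(len(centroids)),
        key=lambda c: squared_distance(state, centroids[c]),
    )
    return [i for i in sorted(embeddings) if assignment[i] == nearest]


def dot_q(state, item_vector):
    """Hypothetical Q-value: a dot product in place of a learned Q-network."""
    return sum(s * v for s, v in zip(state, item_vector))


def recommend(state, embeddings, candidates, q_value=dot_q):
    """Greedy action: the candidate with the highest Q-value."""
    return max(candidates, key=lambda i: q_value(state, embeddings[i]))


# Toy data: two well-separated groups of items in a 2-D embedding space.
item_embeddings = {
    "a": [0.0, 0.1], "b": [0.1, 0.0],
    "c": [5.0, 5.1], "d": [5.1, 5.0], "e": [5.2, 5.3],
}
user_state = [5.0, 5.0]

centroids, assignment = kmeans(item_embeddings, k=2)
candidates = select_candidates(user_state, item_embeddings, centroids, assignment)
choice = recommend(user_state, item_embeddings, candidates)
print(candidates, choice)
```

In this sketch the agent evaluates only the items in one cluster instead of the full catalogue, which is the sense in which clustering shrinks the action space.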
