Research on an Intelligent Recommendation Algorithm Integrating User Interest Modeling

Information Technology and Network Security, Issue 11

Hong Zhili, Lai Jun, Cao Lei, Chen Xiliang

(Command & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, Jiangsu, China)

Abstract: Reinforcement learning is increasingly applied to recommendation systems. This paper proposes a recommendation method based on DDPG that integrates dynamic user interest modeling (DDPG-LA). An LSTM network extracts the user's long-term interest, an attention mechanism extracts the user's short-term interest, and the two are combined to form the agent's state. In addition, a state enhancement unit is added to the LSTM network to accelerate the modeling of long-term interest, and a module that alleviates recommendation delay is added to the attention mechanism to address the shortcomings that arise when attention is applied to recommendation systems. The model is evaluated on two MovieLens datasets and compared with traditional methods on a range of evaluation metrics; the results show that the proposed algorithm performs better.

Keywords: reinforcement learning; recommendation system; DDPG; DDPG-LA; LSTM; attention mechanism; long-term interest; short-term interest

CLC number: TP18
Document code: A
DOI: 10.19358/j.issn.2096-5133.2021.11.006
Citation format: Hong Zhili, Lai Jun, Cao Lei, et al. Research on intelligent recommendation algorithm integrating user interest modeling[J]. Information Technology and Network Security, 2021, 40(11): 37-48.

0 Introduction

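Recommendation systems [1] are tools that, in the era of big data, help people quickly and accurately locate items of interest among a vast number of options. The basic idea is to build a model that extracts user and item features from users' historical data, and then to use the trained model to recommend items targeted to each user.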

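In recent years, with the rapid development of reinforcement learning, research on applying it to recommendation systems has attracted growing attention. DRN [2] was the first exploratory model to apply deep reinforcement learning to recommendation systems, and it established a basic framework for such applications. Figure 1 shows the block diagram of a recommendation system based on deep reinforcement learning.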
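The interaction loop behind such a framework can be illustrated with a toy sketch: an agent recommends an item (the action), a simulated user responds with feedback (the reward), and the interaction history (the state) is updated. Everything below is an assumption made for illustration. `ToyUser`, `EpsilonGreedyAgent`, and `run_episode` are hypothetical names, the agent is a simple epsilon-greedy value tracker rather than a deep network, and none of it reproduces DRN or the method proposed in this paper.

```python
import random


class ToyUser:
    """Hypothetical environment: a simulated user with hidden click probabilities."""

    def __init__(self, n_items, rng):
        self.rng = rng
        self.prefs = [rng.random() for _ in range(n_items)]
        self.history = []  # the state: items recommended so far

    def step(self, item):
        # Reward is 1.0 for a simulated click, 0.0 otherwise.
        reward = 1.0 if self.rng.random() < self.prefs[item] else 0.0
        self.history.append(item)
        return list(self.history), reward


class EpsilonGreedyAgent:
    """Stand-in agent: keeps a running-average reward per item (no neural network)."""

    def __init__(self, n_items, epsilon, rng):
        self.rng = rng
        self.epsilon = epsilon
        self.values = [0.0] * n_items
        self.counts = [0] * n_items

    def act(self, state):
        # This toy agent ignores the state; a deep RL agent would condition on it.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def learn(self, item, reward):
        # Incremental update of the running-average reward for the chosen item.
        self.counts[item] += 1
        self.values[item] += (reward - self.values[item]) / self.counts[item]


def run_episode(n_items=5, steps=200, seed=0):
    rng = random.Random(seed)
    env = ToyUser(n_items, rng)
    agent = EpsilonGreedyAgent(n_items, 0.1, rng)
    state, total = [], 0.0
    for _ in range(steps):
        action = agent.act(state)          # agent recommends an item
        state, reward = env.step(action)   # user feedback and updated state
        agent.learn(action, reward)        # agent updates from the reward
        total += reward
    return total, agent, env


total_reward, agent, env = run_episode()
print("total reward over 200 recommendations:", total_reward)
```

In a deep reinforcement learning recommender, the value table would be replaced by learned networks and the state would be a learned representation of the user rather than a raw list of item IDs.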

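Research on recommendation systems based on deep reinforcement learning has already produced many results. Tong Xiangrong [3] et al. applied DQN to a trust recommendation system built on social networks, using the agent to learn a dynamic representation of the trust between users and making recommendations on the basis of these trust values. Liu Shuaishuai [4] applied DDQN to movie recommendation to address low recommendation accuracy, slow speed, and cold start. Munemasa [5] et al. applied the DDPG algorithm to store recommendation to address the sparsity of user data. Zhao [6] et al. applied the Actor-Critic algorithm to list-wise recommendation to address the limitation that traditional recommendation models can only treat recommendation as a static process. These results, along with many studies not listed here, all rely on the properties of reinforcement learning itself to solve recommendation problems, and seldom approach the problem from the perspective of recommendation.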
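As a rough illustration of what a recommendation-oriented state can look like, the sketch below combines a long-term interest vector from an LSTM with a short-term interest vector from attention, as the abstract describes. It is a minimal sketch under stated assumptions, not the authors' implementation. It uses a standard LSTM cell with biases omitted for brevity and scaled dot-product attention with the most recent item as the query. The state enhancement unit and the delay-alleviation module are not modeled, and all names, dimensions, and random weights are hypothetical.

```python
import math
import random


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(W, x, h, c):
    """One step of a standard LSTM cell (biases omitted for brevity)."""
    z = x + h  # list concatenation: [x; h]
    i = [sigmoid(v) for v in matvec(W["i"], z)]    # input gate
    f = [sigmoid(v) for v in matvec(W["f"], z)]    # forget gate
    o = [sigmoid(v) for v in matvec(W["o"], z)]    # output gate
    g = [math.tanh(v) for v in matvec(W["g"], z)]  # candidate cell state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def long_term_interest(W, history, hidden):
    """Run the LSTM over the full history; the final hidden state is the summary."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in history:
        h, c = lstm_step(W, x, h, c)
    return h


def short_term_interest(recent):
    """Scaled dot-product attention over recent items, latest item as query."""
    query = recent[-1]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, item)) / scale for item in recent]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [sum(w * item[d] for w, item in zip(weights, recent))
               for d in range(len(query))]
    return context, weights


def build_state(W, history, hidden, window):
    """Agent state = [long-term interest ; short-term interest]."""
    long_vec = long_term_interest(W, history, hidden)
    short_vec, _ = short_term_interest(history[-window:])
    return long_vec + short_vec


# Hypothetical sizes and random item embeddings, for illustration only.
EMB_DIM, HIDDEN, WINDOW = 4, 3, 5
rng = random.Random(0)
W = {gate: [[rng.uniform(-0.1, 0.1) for _ in range(EMB_DIM + HIDDEN)]
            for _ in range(HIDDEN)] for gate in "ifog"}
history = [[rng.uniform(-1, 1) for _ in range(EMB_DIM)] for _ in range(12)]
state = build_state(W, history, HIDDEN, WINDOW)
print("state dimension:", len(state))
```

Concatenation is only one way to combine the two vectors; it is used here because it keeps the long-term and short-term components separately visible in the state.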

For the full text of this article, please download: http://www.chinaaet.com/resource/share/2000003846
