
Research on intelligent recommendation algorithm integrating user interest modeling

Information Technology and Network Security, Issue 11

Hong Zhili, Lai Jun, Cao Lei, Chen Xiliang

(Command & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China)

Abstract: Reinforcement learning is increasingly applied to recommender systems. This paper proposes DDPG-LA, a recommendation method based on DDPG that incorporates dynamic modeling of user interest. An LSTM network extracts the user's long-term interest and an attention mechanism extracts the user's short-term interest; the two are combined to form the state of the agent. In addition, a state enhancement unit is added to the LSTM network to accelerate the modeling of long-term interest, and a module that alleviates recommendation delay is added to the attention mechanism to address the shortcomings that arise when attention is applied to recommender systems. The model is evaluated on two MovieLens datasets and compared with traditional methods on a range of test metrics; the results show that the proposed algorithm is superior.

Key words: reinforcement learning; recommendation system; DDPG; DDPG-LA; LSTM; attention mechanism; long-term interest; short-term interest

CLC number: TP18
Document code: A
DOI: 10.19358/j.issn.2096-5133.2021.11.006
Citation format: Hong Zhili, Lai Jun, Cao Lei, et al. Research on intelligent recommendation algorithm integrating user interest modeling[J]. Information Technology and Network Security, 2021, 40(11): 37-48.

0 Introduction

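Recommender systems [1] are tools that help people quickly and accurately locate items of interest among the vast number of options available in the big-data era. The basic idea is to build a model that extracts user and item features from users' historical data, and then to use the trained model to make targeted recommendations to users.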

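With the rapid development of reinforcement learning in recent years, research on applying it to recommender systems has attracted growing attention. DRN [2] was the first exploratory model to apply deep reinforcement learning to recommendation, and it established a basic framework for such applications. Figure 1 shows a block diagram of a recommender system based on deep reinforcement learning.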
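The interaction loop that a reinforcement-learning recommender framework implies can be sketched in Python. This is only an illustration under stated assumptions, not the DRN model or the framework from the paper: `ToyUserEnv`, `OptimisticGreedyAgent`, and `run_episode` are hypothetical names, the simulated user simply rewards a fixed set of liked items, and the agent is a simple optimistic greedy learner that ignores the state.

```python
from collections import deque


class ToyUserEnv:
    """Hypothetical simulated user: reward 1.0 for liked items, 0.0 otherwise."""

    def __init__(self, liked_items, history_len=3):
        self.liked_items = set(liked_items)
        # Keep only the most recent positively received items.
        self.history = deque(maxlen=history_len)

    def state(self):
        # The state is a snapshot of the user's recent positive interactions.
        return tuple(self.history)

    def step(self, item):
        # The user's feedback on the recommended item is the reward signal.
        reward = 1.0 if item in self.liked_items else 0.0
        if reward > 0:
            self.history.append(item)
        return self.state(), reward


class OptimisticGreedyAgent:
    """Placeholder agent (not DRN or DDPG): running mean reward per item."""

    def __init__(self, n_items):
        # Optimistic initial values make the agent try every item at least
        # once before it settles on one.
        self.value = [1.0] * n_items
        self.count = [0] * n_items

    def act(self, state):
        # This toy policy ignores the state; a real agent would condition on it.
        return max(range(len(self.value)), key=lambda i: self.value[i])

    def learn(self, state, action, reward, next_state):
        # Incremental update of the mean reward observed for this item.
        self.count[action] += 1
        self.value[action] += (reward - self.value[action]) / self.count[action]


def run_episode(env, agent, n_steps=20):
    """Run the observe, recommend, feedback, learn loop and return total reward."""
    total = 0.0
    state = env.state()
    for _ in range(n_steps):
        action = agent.act(state)               # recommend an item
        next_state, reward = env.step(action)   # user responds
        agent.learn(state, action, reward, next_state)
        state = next_state
        total += reward
    return total
```

With `ToyUserEnv(liked_items={3, 7})`, `OptimisticGreedyAgent(n_items=10)`, and 20 steps, the agent tries items 0, 1, and 2 without reward and then keeps recommending item 3, so the episode return is 17.0. The point of the sketch is the shape of the loop (state, action, reward, next state), which is what a deep reinforcement learning agent would plug into.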

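Research on recommender systems based on deep reinforcement learning has already produced many results. Tong Xiangrong et al. [3] applied DQN to a trust-based recommender system built on social networks, in which the agent learns a dynamic representation of the trust between users and makes recommendations on the basis of these trust values. Liu Shuaishuai [4] applied DDQN to movie recommendation to address low recommendation accuracy, slow speed, and cold start. Munemasa et al. [5] applied the DDPG algorithm to store recommendation to address the sparsity of user data. Zhao et al. [6] applied the Actor-Critic algorithm to list-wise recommendation to address the limitation that traditional recommendation models can only treat recommendation as a static process. These studies, and many others not listed here, all exploit the properties of reinforcement learning itself to solve recommendation problems; few of them approach the problem from the perspective of recommendation.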
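To make the recommendation-side view concrete, the state construction summarized in the abstract (a long-term interest vector and a short-term interest vector combined into the agent's state) can be sketched as follows. This is a rough illustration under loud assumptions, not the DDPG-LA implementation: an exponential moving average stands in for the LSTM, the attention is plain softmax dot-product attention with the most recent item as the query, the two vectors are combined by concatenation, and the state enhancement unit and the delay-alleviation module are omitted. All function names and the `decay` and `window` parameters are hypothetical.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def long_term_interest(history, decay=0.9):
    """Stand-in for the LSTM: exponential moving average over the full history."""
    summary = [0.0] * len(history[0])
    for emb in history:
        summary = [decay * s + (1.0 - decay) * e for s, e in zip(summary, emb)]
    return summary


def short_term_interest(recent, query):
    """Softmax dot-product attention over the recent item embeddings."""
    weights = softmax([dot(query, emb) for emb in recent])
    dim = len(query)
    return [sum(w * emb[d] for w, emb in zip(weights, recent)) for d in range(dim)]


def build_state(history, window=3):
    """Concatenate long-term and short-term interest (history must be non-empty)."""
    recent = history[-window:]
    query = history[-1]  # assumption: the most recent item acts as the query
    return long_term_interest(history) + short_term_interest(recent, query)
```

Here `history` is a non-empty list of item embedding vectors in interaction order, and `build_state` returns a vector twice the embedding dimension. In the method described by the paper the long-term part would come from a trained LSTM rather than a fixed moving average.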

The full text of this article can be downloaded at: http://forexkbc.com/resource/share/2000003846

