Background
Today, we are undergoing a transition from Information Technology (IT) to Data Technology (DT), one of whose most obvious marks is information overload. How can a particular user quickly find interesting information in a huge amount of information? There are two related solutions: search engines and recommendation systems. A search engine requires the user to describe his or her own requirements accurately, whereas a recommendation system discovers the personalized requirements and interest characteristics of the user by analyzing and mining the user's behavior, and recommends information or items that the user may be interested in. An excellent recommendation system can connect the user, the merchant and the platform and allow all three parties to profit, so recommendation systems not only receive a great deal of attention and research in academic circles, but are also widely applied in various application scenarios, gradually becoming a standard configuration in most fields.
Electronic commerce websites are a large application field of personalized recommendation systems. Personalized recommendation systems are also an important application on movie and video websites, where they help users find videos of interest in vast video libraries. In social networks, a user's social information can be used to make personalized item recommendations, to recommend friends to the user, and the like. The recommendation of personalized advertisements has also become a topic of continuing interest. In addition, there are applications in personalized music recommendation, news reading recommendation, location-based services and the like. In a word, recommendation systems are ubiquitous, have extremely high commercial value, and bring great convenience to study and daily life.
The greatest advantage of personalized recommendation is that the system can collect user profiles and actively make personalized recommendations based on user characteristics such as interest preferences. Moreover, the recommendations given by the system can be updated in real time: when the commodity library or the user characteristic library in the system changes, the recommendation sequence changes automatically. This greatly improves the ease and effectiveness of e-commerce activities, as well as the level of service of the enterprise. If the recommendation quality is high, the user will come to rely on the recommendation system. Therefore, a personalized recommendation system not only provides a personalized recommendation service for the user, but can also establish a long-term, stable relationship with the user, thereby effectively retaining customers, improving customer loyalty and website click-through rate, and preventing customer churn. In an increasingly severe competitive environment, a personalized recommendation system can effectively retain customers, improve the service capability of an e-commerce system, bring great convenience to users' lives, and bring great economic benefits to companies.
The most important module in a recommendation system is the recommendation algorithm, and the most widely applied family of recommendation algorithms is Collaborative Filtering (CF). CF algorithms largely fall into two categories: user-based collaborative filtering (User-based CF) and item-based collaborative filtering (Item-based CF). The core idea of Item-based CF is to recommend to users items similar to those they liked before, so the algorithm mainly consists of two steps: (1) calculating the similarity between items; (2) generating a recommendation list for the user according to the item similarities and the user's historical behavior.
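As an illustration of these two steps, the following minimal sketch assembles a conventional Item-based CF pipeline with cosine similarity on a tiny hypothetical interaction matrix; it illustrates the baseline described above, not the method of the invention.

```python
import numpy as np

# Hypothetical user-item interaction matrix R (rows: users, columns: items),
# where 1 indicates an observed interaction and 0 an unobserved one.
R = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
], dtype=float)

# Step (1): cosine similarity between item column vectors.
norms = np.linalg.norm(R, axis=0, keepdims=True)
norms[norms == 0] = 1.0            # guard against division by zero
S = (R.T @ R) / (norms.T @ norms)
np.fill_diagonal(S, 0.0)           # an item should not recommend itself

# Step (2): score items for a user by accumulating similarities to the
# items in the user's history, then rank the unseen items.
u = 0
scores = R[u] @ S
scores[R[u] > 0] = -np.inf         # exclude already-interacted items
recommendation = np.argsort(-scores)
print(recommendation)
```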
Early Item-based CF used only the Pearson correlation coefficient and cosine similarity to compute the similarity between items. This approach is too simple: it requires manual tuning, and the tuned method cannot be applied directly to a new data set. More recently, model-based methods have been used that customize an objective function to learn the similarity matrix directly from data by minimizing the loss between the original user-item interaction matrix and the interaction matrix reconstructed by the Item-based CF model. The performance of such methods on rating tasks and Top-k recommendation is superior to that of traditional heuristic-based methods. However, the number of items in a recommendation system is often huge, so learning the similarity matrix is highly complex. Moreover, such methods can only estimate the similarity between two items purchased together or rated together; they cannot estimate the similarity between unrelated items and therefore cannot capture transitive relationships between items. Later, Kabbur et al. proposed the factored item similarity model FISM, which represents each item as an embedding vector, with the similarity between two items parameterized as the inner product of their embedding vectors. When the user has a new interaction, only the similarity between the new item and the item to be predicted (also called the target item) needs to be calculated and accumulated with the original similarities to obtain the user's degree of preference for the target item. The method is therefore very suitable for online recommendation tasks, and experimental results on several data sets with different degrees of sparsity show that it can effectively handle sparse data sets. This model also has a drawback: it assumes that the historical items the user has interacted with contribute equally when predicting the user's preference for the target item. This does not fit real recommendation scenarios, and to address this deficiency He et al. proposed a neural attentive item similarity model, called NAIS, which uses an attention mechanism to assign a weight to each historical item to distinguish their different contributions to the user's preference. However, user interests change constantly; although a single deep neural model has strong generalization ability owing to its depth and complexity, it ignores the most primitive interaction information between the user and the item, lacks memorization ability, and some of the recommended items may deviate from the user's interests.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the defects of the prior art, an item recommendation method based on generalized neural attention, thereby improving the accuracy of item recommendation;
in order to solve the above technical problem, the technical scheme adopted by the invention is an item recommendation method based on generalized neural attention, which comprises the following steps:
step 1, combining a generalized matrix factorization model GMF and a neural attention similarity model NAIS to establish a generalized neural attention recommendation model GNAS;
the generalized matrix factorization model GMF is shown in the following formula:

$$\hat{y}_{ui} = h^{T} \sum_{j \in \mathcal{R}_u^{+} \setminus \{i\}} \left( p_i \odot q_j \right)$$

wherein $\hat{y}_{ui}$ represents the preference degree of user u for the target item i, j is a historical item with which user u has previously interacted, $\mathcal{R}_u^{+}$ is the set of items with which user u has interacted, $p_i$ and $q_j$ represent the embedding vector of the target item to be predicted and the embedding vector of a historical item interacted with by the user, ⊙ represents the element-wise product between vectors, and $h^{T}$ is a convolutional layer used to extract more feature information between the user and the items and to improve generalization ability; excluding the target item i from the summation prevents self-recommendation;
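A minimal sketch of the GMF scoring formula above, with randomly initialized embeddings standing in for learned parameters and an illustrative embedding size:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 16                                        # embedding size (illustrative)
n_items = 100
P = rng.normal(scale=0.1, size=(n_items, k))  # target-item embeddings p_i
Q = rng.normal(scale=0.1, size=(n_items, k))  # history-item embeddings q_j
h = rng.normal(scale=0.1, size=k)             # projection vector h^T

def gmf_score(i, history):
    # Exclude the target item i from the history to prevent self-recommendation.
    js = [j for j in history if j != i]
    # Sum of element-wise products p_i ⊙ q_j, projected by h^T.
    return h @ sum(P[i] * Q[j] for j in js)

print(gmf_score(3, history=[1, 5, 3, 42]))
```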
step 1.1, constructing the generalized neural attention recommendation model;
a sparse latent vector of the target item is generated through one-hot encoding, and the latent vector of user u is generated through multi-hot encoding of the historical items j with which user u has interacted; the embedding layer then maps the user and the items to embedding vectors; the generalized matrix factorization model GMF and the neural attention similarity model NAIS are made to share the user and item embedding vectors to obtain the generalized neural attention recommendation model GNAS, as shown in the following formula:

$$\hat{y}_{ui} = h^{T} \sum_{j \in \mathcal{R}_u^{+} \setminus \{i\}} \left( p_i \odot q_j \right) + \sum_{j \in \mathcal{R}_u^{+} \setminus \{i\}} a_{ij} \, p_i^{T} q_j$$

wherein $h^{T}$ is a convolutional layer whose purpose is to prevent the vanishing gradient that would be caused by adding the element-wise product directly into the generalized neural attention recommendation model GNAS, and $a_{ij}$ is an attention weight used to calculate the contribution of an interacted historical item j when predicting user u's preference for the target item i; it is parameterized as a variant of the softmax function over an attention function f, as shown in the following formula:

$$a_{ij} = \frac{\exp\left( f\left( p_i, q_j \right) \right)}{\left[ \sum_{k \in \mathcal{R}_u^{+} \setminus \{i\}} \exp\left( f\left( p_i, q_k \right) \right) \right]^{\beta}}$$

wherein β is a penalty coefficient whose value range is [0, 1], used to reduce the model's penalty on active users whose number of historical interacted items exceeds a threshold;
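A minimal sketch of the GNAS prediction as reconstructed above: a GMF memorization branch plus an attention-weighted NAIS branch. The learned attention weights are defined in the next step; a uniform stand-in is used here, and all embeddings are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n_items = 16, 100
P = rng.normal(scale=0.1, size=(n_items, k))  # target-item embeddings p_i
Q = rng.normal(scale=0.1, size=(n_items, k))  # history-item embeddings q_j
h = rng.normal(scale=0.1, size=k)             # GMF projection h^T

def gnas_score(i, history, attention):
    js = [j for j in history if j != i]       # exclude target item i
    # GMF branch: memorizes the second-order user-item relation.
    gmf_term = h @ sum(P[i] * Q[j] for j in js)
    # NAIS branch: attention-weighted sum of inner products p_i^T q_j.
    a = attention(i, js)
    nais_term = sum(a_ij * (P[i] @ Q[j]) for a_ij, j in zip(a, js))
    return gmf_term + nais_term

# Uniform attention as a stand-in; the learned weights a_ij are sketched below.
uniform = lambda i, js: [1.0 / len(js)] * len(js)
print(gnas_score(3, history=[1, 5, 42], attention=uniform))
```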
said attention function f combines the generalized matrix factorization model GMF with the MLP and maps the result to the output layer through the vector $h_a$, as shown in the following formula:

$$f\left( p_i, q_j \right) = h_a^{T} \left( \mathrm{ReLU}\left( W \left( p_i \odot q_j \right) + b \right) + p_i \odot q_j \right)$$

wherein the input of the attention function f is $p_i$ and $q_j$, W and b represent the weight matrix and the bias vector respectively, ReLU is the activation function, $h_a$ is a trainable vector that projects the result from the hidden layer to the output layer, and the dimension of the weight matrix W corresponds to the dimension of $h^{T}$;
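A minimal sketch of the integrated attention mechanism, following the formula above as reconstructed (the addition of the raw element-wise product to the MLP output represents the GMF branch); W, b and h_a are random stand-ins for learned parameters, and β = 0.5 is an illustrative value.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 16
W = rng.normal(scale=0.1, size=(k, k))   # attention weight matrix
b = rng.normal(scale=0.1, size=k)        # attention bias vector
h_a = rng.normal(scale=0.1, size=k)      # hidden-to-output projection

def f(p_i, q_j):
    hidden = np.maximum(W @ (p_i * q_j) + b, 0.0)  # MLP branch (ReLU)
    return h_a @ (hidden + p_i * q_j)              # plus GMF branch

def attention_weights(p_i, Q_hist, beta=0.5):
    e = np.exp([f(p_i, q_j) for q_j in Q_hist])
    # β in [0, 1] smooths the denominator so that active users with long
    # histories are penalized less than under a standard softmax.
    return e / (e.sum() ** beta)

p_i = rng.normal(scale=0.1, size=k)
Q_hist = rng.normal(scale=0.1, size=(3, k))
print(attention_weights(p_i, Q_hist))
```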
step 1.2, pre-training the constructed generalized neural attention recommendation model;
in the pre-training process, the item vectors in the generalized neural attention recommendation model GNAS are initialized with the item embedding vectors trained by the factored item similarity model FISM instead of being initialized randomly; the other parameters to be learned, $h^{T}$, $h_a$, W and b, are initialized with a Gaussian distribution;
step 2, optimizing the model, in which the attention mechanism integrates GMF and a multilayer perceptron (MLP);
step 2.1, establishing the objective function of the model, as shown in the following formula:

$$L = -\frac{1}{N} \left[ \sum_{(u,i) \in \mathcal{R}^{+}} \log \sigma\left( \hat{y}_{ui} \right) + \sum_{(u,j) \in \mathcal{R}^{-}} \log\left( 1 - \sigma\left( \hat{y}_{uj} \right) \right) \right] + \lambda \left\| \Theta \right\|^{2}$$

where L is the loss and σ is the sigmoid function, whose purpose is to limit the predicted result $\hat{y}_{ui}$ to (0, 1); $\mathcal{R}^{+}$ and $\mathcal{R}^{-}$ represent the positive example set composed of items the user has interacted with and the negative example set composed of items the user has not interacted with, the sum of whose sizes is the number of training examples N; Θ represents all training parameters, including $p_i$, $q_j$, $h_a$, $h^{T}$, W and b; λ is a hyper-parameter controlling the degree of $L_2$ regularization to prevent overfitting;
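A minimal sketch of this objective, assuming raw scores have already been computed for a batch of positive and negative examples; the parameter list Θ is passed in as arrays.

```python
import numpy as np

def gnas_loss(y_pos, y_neg, params, lam=1e-6):
    """Binary cross-entropy over positive and negative predictions plus
    L2 regularization. y_pos/y_neg are raw scores; params is the list Θ."""
    sigma = lambda x: 1.0 / (1.0 + np.exp(-x))   # squashes scores to (0, 1)
    n = len(y_pos) + len(y_neg)                  # number of training examples N
    bce = -(np.log(sigma(y_pos)).sum() + np.log(1 - sigma(y_neg)).sum()) / n
    l2 = lam * sum((p ** 2).sum() for p in params)
    return bce + l2

print(gnas_loss(np.array([2.0, 1.5]), np.array([-1.0, 0.3, -2.0]), params=[]))
```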
step 2.2, in order to minimize the objective function, the adaptive gradient algorithm Adagrad is adopted to automatically adjust the learning rate of the parameters during training; for each positive example (u, i), a proportion of negative examples is randomly sampled from the unobserved interactions to pair with it, as shown in the sketch below.
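A minimal sketch of the negative-sampling step, assuming integer item IDs and that each user interacts with far fewer items than exist; the default of 4 negatives per positive follows the NAIS setting adopted later in the embodiment.

```python
import random

def sample_training_pairs(positives, n_items, neg_ratio=4, seed=0):
    """For each positive (u, i), draw neg_ratio items the user has not
    interacted with as negative examples."""
    rng = random.Random(seed)
    by_user = {}
    for u, i in positives:
        by_user.setdefault(u, set()).add(i)
    pairs = []
    for u, i in positives:
        pairs.append((u, i, 1))                # positive example, label 1
        for _ in range(neg_ratio):
            j = rng.randrange(n_items)
            while j in by_user[u]:             # resample observed interactions
                j = rng.randrange(n_items)
            pairs.append((u, j, 0))            # negative example, label 0
    return pairs

print(sample_training_pairs([(0, 1), (0, 2), (1, 0)], n_items=10))
```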
step 3, after the model is optimized, predicting the user's degree of preference for the target item through the optimized generalized neural attention recommendation model, and generating a personalized recommendation list for the user;
The beneficial effects produced by adopting the above technical scheme are as follows:
In the item recommendation method based on generalized neural attention according to the invention, the generalized matrix factorization model is used to memorize the second-order relationship between the user and the items, and the neural attention similarity method is combined with it to mine the user's latent interests and hobbies, improving the interpretability and diversity of the recommendation system. Secondly, an attention mechanism combining the GMF model and the MLP model is adopted to estimate the weight of each historical item when predicting the preference degree for the target item, which greatly improves recommendation accuracy at a small time cost and recommends items that better match the user's interests.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
An item recommendation method based on generalized neural attention is disclosed. As shown in FIG. 1, a generalized matrix factorization model (GMF) and a neural attention similarity model (NAIS) are combined to establish a generalized neural attention recommendation model GNAS; an attention model integrating GMF and MLP is used within the model; after the model is optimized, the preference degree of a user for a target item is predicted through the optimized generalized neural attention recommendation model, and a personalized recommendation list is generated for the user;
the generalized matrix factorization model (GMF) is given by the following formula:

$$\hat{y}_{ui} = h^{T} \sum_{j \in \mathcal{R}_u^{+} \setminus \{i\}} \left( p_i \odot q_j \right)$$

wherein $\hat{y}_{ui}$ represents the preference degree of user u for the target item i, j is a historical item with which user u has previously interacted, $\mathcal{R}_u^{+}$ is the set of items with which user u has interacted, $p_i$ and $q_j$ represent the embedding vector of the target item to be predicted and the embedding vector of a historical item interacted with by the user, ⊙ represents the element-wise product between vectors, and $h^{T}$ is a convolutional layer used to extract more feature information between the user and the items and to improve generalization ability; excluding the target item i from the summation prevents self-recommendation;
matrix Factorization (MF) is the most popular collaborative filtering algorithm in the recommendation field. The idea of item-based matrix decomposition is to simulate a true user's click rate or scored matrix on an item by multiplying the user's low-dimensional potential vector matrix with the item's low-dimensional potential vector matrix. Generating respective sparse feature vectors by a user and an article to be predicted through one-hot coding, and respectively obtaining embedding vectors of the user and the article to be predicted through an embedding layer by the obtained vectors; compared to a general MF model, a generalized MF model may be more expressive in modeling interactions between users and historical artifacts, and is therefore named GMF.
The specific method for establishing the generalized neural attention recommendation model GNAS by combining the generalized matrix factorization model (GMF) and the neural attention similarity model (NAIS) comprises the following steps:
(1) construction of generalized neural attention recommendation model
A sparse latent vector of the target item is generated through one-hot encoding, and the latent vector of the user is obtained through multi-hot encoding of the historical items with which the user has interacted; the embedding layer then maps the user and the items to embedding vectors; the generalized matrix factorization model GMF and the neural attention similarity model NAIS are made to share the user and item embedding vectors to obtain the generalized neural attention recommendation model GNAS, as shown in the following formula:

$$\hat{y}_{ui} = h^{T} \sum_{j \in \mathcal{R}_u^{+} \setminus \{i\}} \left( p_i \odot q_j \right) + \sum_{j \in \mathcal{R}_u^{+} \setminus \{i\}} a_{ij} \, p_i^{T} q_j$$

wherein $h^{T}$ is a convolutional layer whose purpose is to prevent the vanishing gradient that would be caused by adding the element-wise product directly into the generalized neural attention recommendation model GNAS, and $a_{ij}$ is an attention weight used to calculate the contribution of an interacted historical item j when predicting user u's preference for the target item i; it is parameterized as a variant of the softmax function over an attention function f, as shown in the following formula:

$$a_{ij} = \frac{\exp\left( f\left( p_i, q_j \right) \right)}{\left[ \sum_{k \in \mathcal{R}_u^{+} \setminus \{i\}} \exp\left( f\left( p_i, q_k \right) \right) \right]^{\beta}}$$

the variant of the softmax function mainly adds the exponent β to the denominator of the standard softmax, which converts the attention scores into a probability distribution; β is a penalty coefficient whose value range is [0, 1], reducing the model's penalty on active users whose historical interacted items exceed a threshold;
the GNAS model uses only MLP as an attentiveness mechanism to model deep relationships between historical items and target items, and lacks a wide kernel to remember the most primitive information between the user and the item. To solve the problem, a GMF method is also added to an attention function mechanism to form an integrated attention network to calculate the contribution of historical articles to the representation of user preference, and the calculated weight can more comprehensively model the complex user-article interaction in the user decision process. Integral framework and attention mechanism of model a
ijThe attention function f is combined by two models of a generalized matrix factorization model GMF and an MLP and is combined by a vector
Mapping to the output layer, as shown in the following equation:
wherein the input of the attention function f is p
i and q
jW and b represent the weight matrix and the bias vector, respectively, ReLU is the activation function,
is a set of vectors that need to be trained in order to project the results from the hidden layer to the output layer, h
TIs a convolutional layer, whose dimensions correspond to the weight matrix W,
h
Tw, b were learned from experimental data;
the GMF model is introduced and integrated on the basis of an original neural attention similarity model (NAIS), the integrated attention mechanism is adopted to calculate the weight of the historical articles interacted by the user, and the most advanced performance is provided in the article recommendation scene based on implicit feedback. (2) Pre-training the constructed generalized neural attention recommendation model
Training the parameters of the attention network and the item embedding vectors simultaneously leads to slow convergence and mutual-adaptation effects, limiting the improvement of model performance. Therefore, in the pre-training process, the item vectors in the GNAS model are initialized with the item embedding vectors trained by FISM instead of random initialization; since the FISM model does not involve optimizing attention weights, it can directly learn more representative item vectors. With this arrangement, the convergence speed of the model is increased, and the training of the attention network and the other parameters is greatly improved. Since the model uses item embedding vectors pre-trained with FISM, the other parameters that need to be learned are initialized with a Gaussian distribution.
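A minimal sketch of this initialization scheme; fism_P and fism_Q stand in for item embeddings that would in practice be loaded from a trained FISM model.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n_items = 16, 100

# Stand-ins for embeddings that would be loaded from a FISM pre-training run.
fism_P = rng.normal(scale=0.1, size=(n_items, k))
fism_Q = rng.normal(scale=0.1, size=(n_items, k))

# Item embedding vectors: copied from FISM instead of random initialization.
P = fism_P.copy()
Q = fism_Q.copy()

# Remaining parameters (attention network and projections): Gaussian init.
W = rng.normal(scale=0.01, size=(k, k))
b = rng.normal(scale=0.01, size=k)
h = rng.normal(scale=0.01, size=k)
h_a = rng.normal(scale=0.01, size=k)
```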
The specific method for optimizing the generalized neural attention recommendation model GNAS comprises the following steps:
establishing the objective function of the model, as shown in the following formula:

$$L = -\frac{1}{N} \left[ \sum_{(u,i) \in \mathcal{R}^{+}} \log \sigma\left( \hat{y}_{ui} \right) + \sum_{(u,j) \in \mathcal{R}^{-}} \log\left( 1 - \sigma\left( \hat{y}_{uj} \right) \right) \right] + \lambda \left\| \Theta \right\|^{2}$$

wherein σ is the sigmoid function, whose purpose is to limit the predicted result $\hat{y}_{ui}$ to (0, 1); $\mathcal{R}^{+}$ and $\mathcal{R}^{-}$ represent the positive example set composed of items the user has interacted with and the negative example set composed of items the user has not interacted with, the sum of whose sizes is the number of training examples N; Θ represents all training parameters, including $p_i$, $q_j$, $h_a$, $h^{T}$, W and b; λ is a hyper-parameter controlling the degree of $L_2$ regularization to prevent overfitting;
In order to minimize the objective function, the adaptive gradient algorithm Adagrad is adopted to automatically adjust the learning rate of the parameters during training; if the descending gradient is large, the learning rate decays faster. For each positive example (u, i), a proportion of negative examples is randomly sampled from the unobserved interactions to pair with it. An appropriate negative sampling rate has a positive effect on model performance. Following the setting of NAIS, this embodiment sets the number of negative examples to 4.
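A minimal sketch of the Adagrad update rule described here; the learning rate and example gradients are illustrative.

```python
import numpy as np

def adagrad_update(param, grad, cache, lr=0.01, eps=1e-8):
    """Adagrad: the accumulated squared gradient (cache) divides the step,
    so parameters with large past gradients decay their effective learning
    rate faster."""
    cache += grad ** 2
    param -= lr * grad / (np.sqrt(cache) + eps)
    return param, cache

w = np.ones(3)
c = np.zeros(3)
for g in (np.array([0.5, -0.2, 0.0]), np.array([0.1, 0.4, -0.3])):
    w, c = adagrad_update(w, g, c)
print(w)
```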
In this embodiment, the generalized neural attention recommendation model GNAS established by the invention is experimentally verified on two real item data sets, MovieLens and Pinterest-20. The performance of the model is judged by two recommendation metrics, Hit Ratio (HR) and Normalized Discounted Cumulative Gain (NDCG). These two metrics have been widely used to evaluate Top-K recommendation and information retrieval. HR@10 can be interpreted as a recall-based metric representing the percentage of users successfully recommended to (i.e., the positive example appears in the top 10), while NDCG@10 is a measure that also takes into account the predicted position of the positive example within the top 10; for both metrics, a larger value indicates better performance.
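A minimal sketch of these two metrics for a single user with one held-out positive item, following the standard leave-one-out protocol; rank positions are 0-based.

```python
import numpy as np

def hr_ndcg_at_k(ranked_items, positive_item, k=10):
    """HR@k: 1 if the held-out positive appears in the top-k list.
    NDCG@k: additionally discounts the hit by its position in the list."""
    topk = list(ranked_items[:k])
    if positive_item not in topk:
        return 0.0, 0.0
    rank = topk.index(positive_item)          # 0-based position of the hit
    return 1.0, 1.0 / np.log2(rank + 2)

hr, ndcg = hr_ndcg_at_k([7, 3, 9, 1], positive_item=9, k=10)
print(hr, ndcg)  # hit at position 2, so HR = 1.0 and NDCG = 1/log2(4)
```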
When the size of the embedding vectors obtained by passing the sparse vectors of the user and the items through the embedding layer is set to 16, the scores of the GNAS model of the invention and the NAIS model on the two evaluation metrics are shown in FIG. 2. In the experiment, the three models GNAS, NAIS and FISM were run for 100 epochs until convergence, and the results of the last 50 epochs are plotted in FIG. 2.
As is clear from FIG. 2, the performance of the GNAS model of the invention is far superior to that of the NAIS model alone, which demonstrates the effectiveness of combining a deep model with a wide model to model user preferences. Specifically, on the MovieLens data set, the GNAS model of the invention improves the scores on the two metrics to 70.88% and 42.69%, compared with the NAIS scores of 69.70% and 41.94% on HR and NDCG, a significant improvement in accuracy on the recommendation task. In addition, the performance improvement of the GNAS model is greater on the non-sparse data set than on the sparse data set, so the GNAS model is better suited to dense data sets. The scores of the proposed GNAS model on the two metrics are also far higher than those of the FISM model, which fully demonstrates the great advantage of the integrated recommendation model in recommendation accuracy and interpretability, and reflects the necessity of adding attention.
This embodiment also compares the performance of the GNAS model of the invention with other recent recommendation models, as shown in Table 1. Some of these models are embedding-based; for fairness, the embedding size is uniformly set to 16.
TABLE 1 Comparison of GNAS and baseline methods on the HR@10 and NDCG@10 metrics with an embedding size of 16
TABLE 2 Training time per epoch for each model
As can be seen from Table 1, the GNAS model achieves the highest score on both metrics, especially on the non-sparse data set MovieLens, which benefits from GMF's enhanced memorization of user-item interactions; this emphasizes the necessity of applying integrated models to recommendation tasks. At the same time, the performance of the attention-based models (such as NAIS and GNAS) on these two data sets is significantly better than that of the other recommendation methods. In addition, the performance of GNAS is superior to that of NAIS, reflecting the effectiveness of designing an integrated model within the attention network. Owing to their different ways of representing users, the user-based methods (MF, MLP) perform worse than the item-based methods (NAIS, GNAS).
This embodiment also gives the training time per epoch for the GNAS model and the baseline models, as shown in Table 2. Training times are not shown for the models implemented in JAVA rather than TensorFlow. The latter two models take longer owing to their attention mechanisms. As can be seen from Table 2, compared with NAIS, the GNAS model of the invention achieves a significant performance improvement at a small additional time cost. This is reasonable, because generalized matrix factorization can capture the low-order associations between users and items simply and efficiently.
Finally, it should be noted that the above examples are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit of the corresponding technical solutions or from the scope of the present invention as defined in the appended claims.