Introduction
E-commerce is developing strongly and gives users a good experience, not only with complete information about products but also with conveniences such as browsing items by catalog or comparing products. Items selected according to certain criteria are recommended to customers with relevant characteristics, such as age, gender, interests, or place of residence. Storing customer and product information is challenging because the numbers of customers and products are very large, growing rapidly, and the sources of collected information are diverse. With a huge database of users and items, access speed limits the effectiveness of e-commerce. The collaborative filtering (CF) algorithm calculates the similarity between users [1], [2] without retrieving their personal information. This algorithm evaluates the similarity between customers based on the items they have interacted with, such as by purchasing, rating, viewing, or commenting. The system then recommends to a target customer the items in the interaction lists of other customers who have a high similarity with them. In these models, the correlation between users and items is recorded in an interaction matrix, which can be implicit or explicit. If the data are implicit, each matrix element stores whether the user is interested in or has shopped for the item; if the data are explicit, the element value represents the user's rating of the item. The similarity between two users is the distance between the two vectors representing them, each of which contains that user's interactions with all items.
The explosion in the number of users and items leads to a huge matrix size, which takes up a large amount of memory for storage as well as computing resources for matrix operations. The matrix factorization (MF) technique can decompose the interaction matrix into two or more matrices of smaller size without losing the useful original information. The singular value decomposition (SVD) algorithm generalizes the eigendecomposition of a square matrix with an orthonormal eigenbasis to a matrix of any size, yielding feature matrices of smaller size with minimal loss. However, most of the elements in these matrices are zero, because most users interact with only a certain number of items and have no interaction with the rest of the item list. This makes the matrices sparse; they need to be compressed both to reduce the required memory and to facilitate computation. Through matrix transformations, the information is collected into embedding matrices, which can be processed efficiently using the TensorFlow and Keras libraries [3]. As the final step of the computation, the embedding matrices can be converted back to the original interaction matrix size for the front-end layers to make recommendations to the user.
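As an illustration of this compression step, the following minimal Python sketch factorizes a toy sparse interaction matrix with truncated SVD; the matrix values and the embedding dimension k are illustrative, and scipy stands in here for the TensorFlow/Keras pipeline used in practice.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Toy implicit interaction matrix R (users x items); real matrices are far larger and sparser.
R = csr_matrix(np.array([
    [1, 0, 0, 1, 0],
    [0, 1, 0, 1, 1],
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1],
], dtype=np.float64))

k = 2                                # embedding dimension, much smaller than the item count
U, s, Vt = svds(R, k=k)              # truncated SVD: R is approximated by U @ diag(s) @ Vt
user_emb = U * np.sqrt(s)            # compact user embeddings (n x k)
item_emb = Vt.T * np.sqrt(s)         # compact item embeddings (m x k)

# The low-rank product recovers a matrix of the original size for the front-end layers.
R_hat = user_emb @ item_emb.T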
Besides exploiting user-item interaction information, users' social relationships are also essential because they reflect a user's influence in real life. A shopping recommendation from a friend has a greater impact on a user's decision than other advertising sources. Today's social networking platforms such as Foursquare, Facebook, and Gowalla also provide databases of users and their friendship relationships. Mining these facts enriches the information and supports not only the recommendation process but also the identification of specific user groups [4], [5].
Both the interactions between users and items and the social relationships between users can be represented as graphs and mined using graph neural networks (GNN) [6]. In a graph, nodes that are neighbors within some number of hops are related. Implicit information can be discovered by traversing the graph and looking for nearby users who have the most items in common. Such high-order connections were exploited in the neural graph collaborative filtering model (NGCF) [7]. Each matrix is also transformed into embedding matrices that are significantly reduced in dimension compared to the original matrix but still contain all the original information. The embedding matrices are decomposed to capture the influence signals, which are then combined into a more informative embedding matrix [8]. The learning process can be repeated many times to reach the target accuracy; in these iterations, the propagated embedding matrices capture high-order connectivity in the user-item interaction graph.
In this publication, we propose a GCN model that analyzes and synthesizes three sources of information: the product-user interaction matrix, the weighted influence matrix between users, and the matrix of social friendships derived from social network platforms. We also implement the model, conduct experiments on well-known data sets, and compare the results with the baseline models. The rest of this article is organized as follows. Section 2 summarizes the research that forms the foundation of the topic and the latest techniques. Section 3 proposes the model, in which we show how to preprocess the data sets, the organization of the neural layers, and the signal aggregator and decomposer. Section 4 presents the overall results and compares them with previous state-of-the-art models. Section 5 gives our conclusions and some ideas for future research.
Literature Review
A. Collaborative Filtering
Recommender systems in the first stage exploited the features of all items as well as the preferences of users to find suitable items and give reasonable recommendations [9]. For example, a book can be categorized as comics, detective, historical, archaeology, fine arts, and so on. For users, information such as age, gender, address of residence, or education can also be used to enrich the input data [10]. However, with a very large number of items and frequent additions, it is difficult to identify and assign item attributes, so systems need additional filtering methods. Collaborative filtering algorithms were introduced to find the similarity between users or items without attribute information [11].
Collaborative filtering can be implemented with memory-based models [12] or model-based models [13]. In a memory-based model, the history of every user's ratings of items is recorded in a rating matrix. There are many ways to define a rating scale, which can be an integer value from 1 to 5 or an implicit rating. In the space made up of the user set, the item set, and the rating values, each user is represented by a feature vector of that user's ratings over all items.
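For example, the similarity between two such feature vectors is often measured with cosine similarity, one common choice; the rating vectors below are hypothetical.

import numpy as np

def cosine_similarity(u, v):
    # similarity between two users' rating vectors (0 marks no interaction)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return (u @ v) / denom if denom > 0 else 0.0

alice = np.array([5, 0, 3, 4, 0], dtype=float)   # hypothetical ratings over five items
bob   = np.array([4, 0, 0, 5, 1], dtype=float)
print(cosine_similarity(alice, bob))             # close to 1 for similar tastes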
In model-based collaborative filtering systems, the algorithms search for patterns in the training data to build a model for future prediction [14]. The matrix containing users' ratings of items can be reduced using MF techniques: to preserve the accuracy of the distances between users and items, the rating matrix is divided into a user feature matrix and an item feature matrix of smaller sizes [15]. The more accurately user similarity is calculated, the better the recommendation results.
The graph convolutional matrix completion (GCMC) model built a graph-based auto-encoder framework for matrix completion [16]. This model moved from matrices to graphs by defining the main problem as a prediction task on a bipartite graph. Graph construction also benefits from additional sources of information, such as friendships in social networks, which are likewise represented by graphs. The encoders traverse the graph and record signals from the vertices representing users and items, producing latent features of these nodes. The collaborative filtering problem between rows (or columns) thus becomes one of predicting the edge weights between two objects representing users or items.
B. Graph Convolution Networks
A graph is a type of data structure consisting of a set of vertices and a set of edges connecting them.
The Graph Convolutional Network (GCN) is an iterative method of collecting information [22]. For example, when a similar item is of interest to many of the target user's friends, it should be recommended more strongly than other items. GraphSAGE [23] built inductive embeddings for each vertex of the graph and learned the topological structure of the graph as well as the effect of vertices on their neighbors. This method not only focuses on feature-rich graphs but also makes use of the structural features of all vertices.
NGCF [7] generates hop-by-hop propagation layers in the input graph under the assumption that the influence of users varies with the distance between the graph's vertices. Attention signals from a user's neighbors are received by that user's embedding at each propagation layer.
C. Social Recommender System
The social recommender system has developed since the emergence of online social platforms. In addition to user behaviors such as liking, sharing, and commenting, additional data from social networks is used to provide personalized recommendations to users, considering the influence of their friends [5], [25], [26]. The flow of information on social networks influences users and draws them toward the same interests as their friends [27], [28]. ContextMF [29] recorded social relational information in the recommendation data within a matrix factorization framework with regularization terms. This model improved the accuracy of the recommendation results compared to models that only perform MF on the user-item relation matrix. The TrustSVD model [25], developed from SVD++ [30], incorporates the influence of friends in social relations as an implicit feedback add-on for an observed user. GCN-based recommendation models have also been extended to include more social relation information, such as GraphRec [31], which views a user from two aspects: the social graph and the graph of interactions with items. The influence among users was aggregated and conditioned by a social aggregation module during learning.
SocialLGN [32] designed deep models to capture the most distinctive characteristics of social networks and to fuse user embeddings from both sources, instead of learning directly from the user-item interaction data alone. Moreover, the model still uses GCN to learn the preferences of users whose influence spreads through social networks. User influence also propagates through multiple layers, a technique used in NGCF [7] and LightGCN [24].
Technically, combining social relational data with user influence in shopping is a challenge, because both change with time and with the nature of each kind of item. In the Socially-Aware Self-Supervised Tri-training (SEPT) model [33], three graph encoders were created to capture signals from users' social relationships, from the relationships of users interested in similar items, and from shared information, augmenting the social information with the shopping interaction information. The self-supervised learning process in this model improved the effectiveness of the recommendations.
Our Proposed Model
In this section, we present the proposed model, which exploits the rating matrix and the social relationships between users to make recommendations for any user in the future. The model is presented as a data processing module, a GCN module, and a prediction layer with a loss calculator. An overview of our model is shown in Figure 2.
A. Data Pre-Processing
The data sets are recorded continuously and redundantly and include noisy data, namely users and items with very little interaction. Eliminating these signals prevents recommending items that are no longer available. Furthermore, the reduced data sets, smaller in size but without loss of valuable information, decrease the required memory and cut down the computation time of the models. In our experiments, we apply a 10-core setting, which means that users with fewer than 10 interactions are dropped at this processing step.
From the selected set of items, we also keep only the users with the highest interactions, measured by the Jaccard distance between the set of items a user interacts with and the set of all filtered items in the database. The ratio of the number of users to the number of items is chosen to be equal to the ratio in the original data set, following the data sets surveyed in the publications [7], [24]. We summarize this ratio in Table 1. Users with a low Jaccard measure are removed as outliers, and the social friendship matrix is rebuilt accordingly.
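A minimal sketch of this user filtering step, with toy data and an illustrative keep ratio, could look as follows.

# Toy data: each user's set of interacted items after item filtering.
user_items = {
    "u1": {"i1", "i2", "i3"},
    "u2": {"i2"},
    "u3": {"i1", "i3", "i4", "i5"},
}
catalog = {"i1", "i2", "i3", "i4", "i5"}     # all filtered items

def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Rank users by their Jaccard score against the catalog and keep the top share,
# chosen so the user-to-item ratio matches the original data set (ratio is illustrative).
scores = {u: jaccard(items, catalog) for u, items in user_items.items()}
n_keep = max(1, int(len(scores) * 2 / 3))
kept = sorted(scores, key=scores.get, reverse=True)[:n_keep]
print(kept)                                   # the remaining users are treated as outliers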
After pre-processing the data set, we obtain three matrices as input to the model: matrix A represents the interactions between users and items, matrix C contains the degree of similarity between users, and matrix S represents the social friendships of users. The matrix S is optional and can be skipped when a data set does not provide real social relationships, or when the noise in the social relation data outweighs the actual influence among users.
Algorithm 1 Filter Out Items and Users With Less Than 10 Interactions
Input: interaction records R
for all item i in R do
  if item i has fewer than 10 interactions then
    remove all records of item i from R
  end if
end for
for all user u in R do
  if user u has fewer than 10 interactions then
    remove all records of user u from R
  end if
end for
return R
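For concreteness, a minimal pandas sketch of this filter follows; unlike the single pass of Algorithm 1, it repeats the two passes until no more records are removed, a common variant of the k-core setting (the column names are illustrative).

import pandas as pd

def k_core_filter(df, k=10):
    # df holds one row per interaction, with columns "user" and "item"
    while True:
        before = len(df)
        df = df[df.groupby("item")["user"].transform("size") >= k]   # drop rare items
        df = df[df.groupby("user")["item"].transform("size") >= k]   # then rare users
        if len(df) == before:   # stable: every remaining user and item has >= k interactions
            return df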
1) The Interaction Matrix A
The user-item interaction data can be provided in implicit or explicit form. Implicit data records interactions such as an item view, an item purchase, a mouse click, or an item like as a binary value of 0 or 1, while explicit data specifies a user's rating of an item within a certain range, for example an integer between 0 and 5. In this article, we consider the interactions between users and items to be implicitly recorded and represented by a bipartite graph, whose adjacency is defined in (1).\begin{align*} R_{i,j}=\begin{cases} \displaystyle 1 & \text {if user $u_{i}$ refers to item $i_{j}$}\\ \displaystyle 0 & \text {otherwise} \end{cases} \tag{1}\end{align*}
Interactive filtering by GCN captures users' characteristics in the user embedding, while the features of items are learned in the item embedding. In order for these two embedding matrices to be updated at the same time and to facilitate propagation, we combine them into the single adjacency matrix A shown in (2). If the matrix R has size n × m, then A has size (n + m) × (n + m).\begin{align*} A = \begin{bmatrix}0 & R \\ R^{\top} & 0 \end{bmatrix} \tag{2}\end{align*}
For convenience in calculating the embeddings of the next layer in our graph convolution operations [7], the symmetrically normalized matrix $\widetilde {A}$ is computed as in (3), where $D$ is the diagonal degree matrix of $A$.\begin{equation*} \widetilde {A} = D^{ -\frac {1}{2}}AD^{ -\frac {1}{2}} \tag{3}\end{equation*}
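A sketch of constructing A from R as in (2) and normalizing it as in (3) is given below, with the usual guard for isolated nodes; the function name is ours.

import numpy as np
import scipy.sparse as sp

def normalized_adjacency(R):
    # R is the sparse n x m user-item matrix; build the bipartite adjacency A of (2)
    n, m = R.shape
    A = sp.vstack([
        sp.hstack([sp.csr_matrix((n, n)), R]),
        sp.hstack([R.T, sp.csr_matrix((m, m))]),
    ]).tocsr()
    deg = np.asarray(A.sum(axis=1)).flatten()
    with np.errstate(divide="ignore"):
        d_inv_sqrt = np.power(deg, -0.5)
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.0       # isolated nodes get zero weight
    D_inv_sqrt = sp.diags(d_inv_sqrt)
    return D_inv_sqrt @ A @ D_inv_sqrt           # D^{-1/2} A D^{-1/2}, as in (3)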
2) The Users Correlation Matrix C
The similarity between each pair of customers in the database depends on how many common items they interact with: the greater the number of items of common interest, the higher the influence between them. Therefore, in the publication [34], a weighted user influence matrix was introduced to capture this effect.
However, the weighted matrix in [34] considers only the number of items two users have in common. For every pair of users $u_{i}$ and $u_{j}$ with interacted item sets $I_{i}$ and $I_{j}$, we record both the union and the intersection of these sets:\begin{align*} W_{U}&= |I_{i}| + |I_{j}| - |I_{i} \cap I_{j}| = |I_{i} \cup I_{j}| \tag{4}\\ W_{I}&=|I_{i} \cap I_{j}| \tag{5}\end{align*}
Therefore, the element-wise product of $W_{U}$ with the element-wise inverse of $W_{I}$ measures, for each pair of users, the ratio between the union and the intersection of their item sets. Furthermore, GCN-based models collect the collaborative signals from the high-order connectivity graph, so the matrix C lets these correlation signals propagate across multiple hops during convolution. The resulting matrix is then normalized, analogously to (3), before entering the propagation layers (see (15)).
Summarizing the above ideas, the matrix C, which represents the degree of similarity between the behavior of the users, is computed as in (6), with $f$ denoting a normalization function:\begin{equation*} C = f(W_{U} \odot {W_{I}}^{-1}) \tag{6}\end{equation*}
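The sketch below implements (4)-(6) literally on a dense toy matrix, taking f to be row normalization as one plausible reading; the exact choice of f is an assumption here.

import numpy as np

def correlation_matrix(R):
    # R is the binary user-item matrix
    counts = R.sum(axis=1)                            # |I_i| for every user
    W_I = R @ R.T                                     # |I_i ∩ I_j|, as in (5)
    W_U = counts[:, None] + counts[None, :] - W_I     # |I_i ∪ I_j|, as in (4)
    W_I_inv = np.where(W_I > 0, 1.0 / np.maximum(W_I, 1), 0.0)   # element-wise inverse
    raw = W_U * W_I_inv                               # W_U ⊙ W_I^{-1}
    row_sums = raw.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0                     # avoid dividing by zero
    return raw / row_sums                             # f: row normalization (assumed)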
3) The Social Friendship Matrix S
Because social relationship data is not always provided in a data set, it is an optional part of our recommendation system. In the next part of the experiment, we present comparative results with and without social relations. Some data sets and publications record the user relationship as a directed trust relation, while others provide an undirected friendship graph.
In recent studies, the social matrix S has been fused with the interaction data; for example, the fused matrix in (7) masks the co-interaction counts with actual friendships:\begin{equation*} A_{s} = (RR^{\top}) \odot S \tag{7}\end{equation*}
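For illustration, (7) can be computed directly on toy binary matrices; only friend pairs that also share items contribute a signal.

import numpy as np

R = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1]], dtype=float)   # toy user-item matrix
S = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)   # toy friendship matrix

A_s = (R @ R.T) * S     # co-interaction counts masked by actual friendship, as in (7)
print(A_s)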
We keep our matrix S separate from the interaction data, feeding it to the propagation layers as an independent input rather than fusing it as in (7).
B. Graph Convolution Network Layers
1) Transformation Weight Matrices and Activation Function
The NGCF model is representative of the most advanced models that use GCN for mining and recommendation. NGCF is designed with feature transformation matrices as well as a nonlinear activation function, like other standard GCNs [35]. The nonlinear activation incorporates the embeddings into the output and smooths the values so as not to introduce bias into the overall result. The LeakyReLU function is commonly used in GCN models [7], [34].
However, developing the NGCF model on data with little description, where both users and items are stored with only a valid ID, does not benefit the feature learning process. On the contrary, considerable computing resources are spent on storing and operating on these matrices. The use of weight transformation matrices and the activation function has been discussed in [24], where the LightGCN model eliminates all the weight transformation matrices and the nonlinear activation function. Similar ablations of SocialLGN have been examined:
SocialLGN-f removes all the feature transformation matrices $W_{1}$, $W_{2}$, and $W_{3}$.
SocialLGN-n eliminates the non-linear activation function $\sigma (.)$.
SocialLGN-fn eliminates both the feature transformation matrices $W_{i}$ and the non-linear activation function $\sigma (.)$.
In all experiments, we apply the same hyperparameters as the publications NGCF and LightGCN, including learning rate, regularization, and dropout rate, on the Gowalla and Yelp2018 data sets. The results are shown in Figure 4 for the Gowalla data set and Figure 5 for the Yelp2018 data set.
When removing either the feature transformation matrices or the nonlinear activation function alone, the results already improve; removing both yields the best performance on both data sets.
From the above presentation, we determine that eliminating both the feature transformation matrices and the nonlinear activation function is appropriate for the data sets considered in this article.
2) Embedding Layers
Following the model in the publication [36], which used two latent vectors to represent each user and item, we define the user embeddings and item embeddings, concatenated as in (8):\begin{equation*} E^{(0)}=E_{U}^{(0)} \parallel E_{I}^{(0)}=[e_{u_{1}}^{(0)},\dotso,e_{u_{n}}^{(0)},e_{i_{1}}^{(0)},\dotso,e_{i_{m}}^{(0)}] \tag{8}\end{equation*}
3) Propagation Process
In order to capture signals from the three input matrices, every embedding at layer $k+1$ is aggregated from its neighbors at layer $k$, as in (9), where ${\mathcal {N}}_{u}^{A}$ denotes the items user $u$ has interacted with, ${\mathcal {N}}_{u}^{S}$ the user's friends, and ${\mathcal {N}}_{u}^{C}$ the users correlated with $u$:\begin{align*} e_{ua}^{(k+1)} &=\sum _{i\in {\mathcal {N}}_{u}^{A}}^{}\frac {1}{ \sqrt {| {\mathcal {N}}_{u}^{A} ||{\mathcal {N}}_{i}^{A}| } }e_{i}^{(k)} \\ e_{s}^{(k+1)} &=\sum _{s\in {\mathcal {N}}_{u}^{S}}^{}\frac {1}{ \sqrt {| {\mathcal {N}}_{u}^{S} ||{\mathcal {N}}_{s}^{S}| } }e_{s}^{(k)} \\ e_{c}^{(k+1)} &=\sum _{c\in {\mathcal {N}}_{u}^{C}}^{}\frac {1}{ \sqrt {| {\mathcal {N}}_{u}^{C} ||{\mathcal {N}}_{c}^{C}| } }e_{c}^{(k)} \\ e_{i}^{(k+1)} &=\sum _{u\in {\mathcal {N}}_{i}^{A}}^{}\frac {1}{ \sqrt {| {\mathcal {N}}_{i}^{A} ||{\mathcal {N}}_{u}^{A}| } }e_{u}^{(k)} \tag{9}\end{align*}
Then the three user-side signals are aggregated into a single user embedding, as in (10):\begin{align*} e_{u}^{(k+1)} =AGGREGATION_{k} \left ({e_{ua}^{(k+1)},\quad e_{c}^{(k+1)}, \quad e_{s}^{(k+1)} }\right) \tag{10}\end{align*}
After K layers of the propagation process, we obtain the final embeddings by averaging over all layer outputs, as in (11):\begin{equation*} e_{u} = \frac {1}{K+1}\sum _{k=0}^{K}e_{u}^{(k)};\quad e_{i} = \frac {1}{K+1}\sum _{k=0}^{K}e_{i}^{(k)} \tag{11}\end{equation*}
Equation (11) takes the mean of the embeddings from all layers; this mean could be replaced with other functions such as the maximum, the median, or a weighted average. In order not to complicate the model while still ensuring good performance in general, we use the mean.
4) Prediction and Optimization
Based on the final embeddings, the preference score of user $u$ for item $i$ is predicted by their inner product, as in (12):\begin{equation*} \widehat {y}_{ui}=e_{u}^{\top} e_{i} \tag{12}\end{equation*}
We build the loss function with Bayesian personalized ranking (BPR) for the model to learn the parameters $\Phi$, as in (13), where $\Omega ^{+}_{ui}$ denotes the observed interactions, $\Omega ^{-}_{uj}$ the sampled unobserved ones, $\sigma$ the sigmoid function, and $\lambda$ the regularization coefficient:\begin{equation*} Loss_{BPR} = \sum _{\Omega ^{+}_{ui}}\sum _{\Omega ^{-}_{uj}} -\ln \sigma (\widehat {y}_{ui} - \widehat {y}_{uj}) + \lambda \parallel \Phi \parallel ^{2}_{2} \tag{13}\end{equation*}
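A minimal numpy sketch of the BPR loss in (13) for one sampled mini-batch follows; each observed pair (u, i) from Ω+ is matched with a sampled unobserved item j, and the regularization term here covers only the embeddings appearing in the batch.

import numpy as np

def bpr_loss(user_emb, item_emb, users, pos_items, neg_items, lam=1e-4):
    # users, pos_items, neg_items are aligned index arrays for the mini-batch
    e_u = user_emb[users]
    y_ui = np.sum(e_u * item_emb[pos_items], axis=1)   # scores y_ui, as in (12)
    y_uj = np.sum(e_u * item_emb[neg_items], axis=1)   # scores y_uj
    log_sig = np.log(1.0 / (1.0 + np.exp(-(y_ui - y_uj))))
    reg = lam * (np.sum(e_u ** 2) + np.sum(item_emb[pos_items] ** 2)
                 + np.sum(item_emb[neg_items] ** 2))
    return -np.sum(log_sig) + reg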
5) Matrix Form
To facilitate the implementation of the model, we express the propagation process in matrix form in (14):\begin{align*} E^{(k+1)} &= E_{U}^{(k+1)} \parallel E_{I}^{(k+1)} \\ &= (E_{U_{A}}^{(k+1)} + E_{S}^{(k+1)} + E_{C}^{(k+1)}) \parallel E_{I}^{(k+1)} \\ &= (\widetilde {R}E_{I}^{(k)} + \widetilde {S}E_{U}^{(k)} + \widetilde {C}E_{U}^{(k)}) \parallel \widetilde {R^{\top} }E_{U}^{(k)} \tag{14}\end{align*}
where the normalized social and correlation matrices are computed analogously to (3):\begin{align*} \widetilde {S} &= D_{S}^{-\frac {1}{2}}SD_{S}^{-\frac {1}{2}} \\ \widetilde {C} &= D_{C}^{-\frac {1}{2}}CD_{C}^{-\frac {1}{2}} \tag{15}\end{align*}
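The matrix form lends itself to a compact implementation. The sketch below follows (14) and the layer averaging of (11); here R_t, S_t, and C_t denote the normalized matrices and E_U, E_I the initial embeddings, all assumed to be numpy arrays of compatible shapes.

import numpy as np

def propagate(E_U, E_I, R_t, S_t, C_t, K=3):
    users_layers, items_layers = [E_U], [E_I]
    for _ in range(K):
        E_U_next = R_t @ items_layers[-1] + S_t @ users_layers[-1] + C_t @ users_layers[-1]
        E_I_next = R_t.T @ users_layers[-1]        # items aggregate from their users
        users_layers.append(E_U_next)
        items_layers.append(E_I_next)
    # (11): average the embeddings of all K+1 layers
    return np.mean(users_layers, axis=0), np.mean(items_layers, axis=0)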
Results
A. Experimental Setting
We use the 10-core setting with all data sets to ensure that each user has at least ten interactions. For each data set, we select the 20% of interactions with the latest timestamps as the testing set, and the remaining 80% of interactions form the training set. A summary of all data sets is shown in Table 3. We fix the embedding size to 64 for all models, and the embedding parameters are initialized with the normalized weight method. Optimization is performed with the Adam algorithm [38].
1) Data Sets
Gowalla: first published in [39], Gowalla is a web-based application that provides location-based social networking. Users can check in at and share public places. Gowalla also collects users' friendship networks and provides these data as an undirected graph.
LibraryThing: data from the book review website LibraryThing [40], an online service that helps people organize their books easily and communicate with each other.
Ciao: an online shopping site that records users' ratings of items with timestamps. Customers can rate an item with a score from 1 to 5. Users can also add others to their friend lists and build a social network [41].
Epinions: a well-known consumer review site where users can rate items and add social friends to their trust lists [42]. The Ciao and Epinions data sets have been widely selected as benchmark data sets for social recommendations.
2) Evaluation Criteria
The evaluative measurement should be selected appropriately for each algorithm [43]. A common evaluation method is dividing a data set into a training set and a test set: the algorithm is trained on the training set to make predictions, and these results are evaluated on the test set. The difference between the predicted values and the actual data values indicates the accuracy of the proposed model. This difference can be expressed as the mean absolute error (MAE) or the root mean square error (RMSE). Besides accuracy and recall, scalability, learning time, memory consumption, and interpretability are also important criteria in evaluating a recommender system.
For implicit data, interactions between the user and the item are recorded as binary values rather than rated with a specific value, and the algorithms use accuracy measures for classification. The commonly used metrics are Precision and Recall [44], [45]: Precision is the ratio of correct predictions among the returned results, and Recall is the sensitivity of the algorithm, i.e., the proportion of relevant items that have been retrieved from the test set.\begin{align*} Precision &= \frac {TP}{TP+FP} \\ Recall &= \frac {TP}{TP+FN} \tag{16}\end{align*}
Furthermore, the Discounted Cumulative Gain (DCG) score [35] assumes that judges have assigned relevance labels to each result and accumulates a gain function over the result list, discounted by rank position. The normalized score for a user $u$ is given in (17):\begin{equation*} NDCG_{u} = \frac {DCG_{u}}{DCG_{max}} \tag{17}\end{equation*}
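The following sketch computes Precision@k, Recall@k, and NDCG@k for a single user's ranked list, matching (16) and (17) with the standard logarithmic rank discount.

import numpy as np

def metrics_at_k(ranked_items, relevant, k=20):
    # ranked_items: items ordered by predicted score; relevant: ground-truth set
    top_k = ranked_items[:k]
    hits = np.array([1.0 if it in relevant else 0.0 for it in top_k])
    precision = hits.sum() / k
    recall = hits.sum() / max(len(relevant), 1)
    discounts = 1.0 / np.log2(np.arange(2, k + 2))           # gains discounted by rank
    dcg = float((hits * discounts).sum())
    idcg = float(discounts[:min(len(relevant), k)].sum())    # DCG of a perfect ranking
    ndcg = dcg / idcg if idcg > 0 else 0.0
    return precision, recall, ndcg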
3) Baseline Models
To evaluate our proposed model, we compare it with the following state-of-the-art models as baselines.
MF [15] is the matrix factorization model demonstrated in the Netflix Prize competition. The model outperforms classic nearest-neighbor techniques for generating product recommendations and allows the inclusion of additional information such as implicit feedback, temporal effects, and confidence levels. Even though this model is dated, we include it in the comparison to show the evolution of recommendation models.
SocialMF [46] incorporates trust information and its propagation into an MF model. The signals from direct friends are embedded in the final matrix.
TrustSVD [25] is a trust-based MF technique built on the SVD++ algorithm. This model handles the explicit and implicit influence of rated items by incorporating reviews from trusted friends to predict items for an active user.
GCMC [16] assumes that making a future recommendation is a matter of predicting the association between a user and an item in the graph. Interaction data is represented by a bipartite graph whose labeled edges denote the observed ratings. The model then uses a graph auto-encoder framework based on message passing on the interaction graph.
NGCF [7] is one of the most cutting-edge GCN models. It performs propagation operations on embeddings over several iterations. High-order connectivity in the interaction graph is contained in the stacked embeddings of the output. The latent vectors contain the collaborative signal, which strengthens the model.
WiGCN [34] is based on the NGCF model, adding a weighted matrix as an additional input. That matrix contains the influence of users on each other, which leads to richer data-gathering propagation and increased recommendation performance.
LightGCN [24] eliminates the weight matrices and the activation function in order to concentrate on the neighborhood aggregation component of CF. The user and item embeddings for the interaction graph are learned by linear propagation, and the weighted sum of all learned embeddings becomes the final embedding.
SocialLGN [32] extends LightGCN's linear propagation to the social graph: user embeddings are propagated in both the interaction graph and the social graph and fused at each layer, so that social relations refine the representations learned from interactions.
SEPT [33] augments the user data views with the user social information. The framework then builds three graph encoders upon the augmented views and iteratively improves each encoder with self-supervision signals from the other two encoders.
B. Overall Result Comparison
A summary of the results of all models on the processed data is displayed in Table 4, with three values for each data set: precision, recall, and NDCG, measured at the top 20 items. Reading from the top row down, the evolution of recommender system models can be clearly identified: the group of methods using GCN achieves better results than the group of classical methods such as MF or SVD. Within the GCN group, models that do not use the transformation matrices, such as LightGCN, give better results than the remaining models. When compared within the group of models that do not consider social relationships, i.e., MF, GCMC, NGCF, LightGCN, and WiGCN, our model with interaction embedding (Our w/ interaction) still gives the highest results.
In each pair of models such as MF and SocialMF, LightGCN and SocialLGN, or LightGCN and SEPT, most of the models that use social relationships give better results. This confirms that mining actual friend data is as important as mining the correlation between users through items of common interest. Our proposed model also efficiently accounts for social relationship data by embedding it in the propagation process.
C. Detailed Model Analysis
1) Effect of Social Relation on Overall Result
We compare each pair of models that are architecturally similar to each other in Table 5. In each row of comparison, one model exploits the user-item relationship data while the other adds user-user relationship data. In each pair of models, the signal propagation and reception formulas are highly similar, using (or not using) the weighted transformation matrices and nonlinear activation functions. The parameter settings in the experiments are the same.
From MF to SocialMF: in the traditional MF model, the characteristics of each user are obtained from the decomposition of the user-item relationship matrix. Meanwhile, the SocialMF model also extracts the characteristics of a user's direct friends to enrich the information for that user. However, the SocialMF model does not explore indirect relationships within the user community, so it misses many valuable signals.
From LightGCN to SocialLGN: this is the only case where the results are worse across all measures. The combination of signals from mining similarities between users with the users' social relationships is not harmonious; as a result, the friend information not only reduces the accuracy of the recommendation results but also dilutes the CF signal.
From LightGCN to SEPT: although both SocialLGN and SEPT extend LightGCN with friendship information, it is clear that the SEPT model has optimized the embedding of the friendship matrix between users into the model and brings a large improvement.
We also tested the influence of friendship data on the overall results by removing the S matrix from the model input and comparing the results between the model without the S matrix (our model with only interaction data) and the full model (Our model-all). We calculated the influence weights between users and clustered those weights into relationship levels, as shown in Table 2. In this way, our model avoids the degradation observed in SocialLGN while still capturing valuable friend signals, as in the SocialMF model, and exploring indirect relationships, as in other GCN models.
2) Time Performance Analysis
The time cost of our model lies in the aggregation over the interaction and social relation graphs, and the total time complexity is acceptable in practice. We kept the runtime environment as similar as possible across experiments; the average time per epoch of the LGC-based models is almost the same on each data set, and the statistics are given in Table 6.
An outstanding feature of our proposed model is its convergence speed. Not only does it achieve higher recommendation accuracy than other models, but its training time is also significantly shorter than that of LGC-architecture-based models such as LightGCN, SocialLGN, and SEPT. Specifically, on the Gowalla data set, our model with only interaction data takes 490 epochs and our full model takes only 440 epochs, while SEPT and LightGCN take 1090 and 1140 epochs, respectively. The remaining two data sets, Ciao and Epinions, show a similar trend, which demonstrates the very good performance of the model training process. We conclude that the user correlation matrix helps the model focus on important data, thereby accelerating the learning process of our model.
3) Positive Impact of the User Correlation Matrix
Our model converges faster than the baseline models, as shown in Figures 6 and 7. In our previous work on the WiGCN model, adding a weight matrix as input generated strong attention signals for nodes in the graph during propagation. The user correlation weight matrix based on the set of shared items only needs to be computed once, using basic matrix operations, and is reused for all embedding layers. Therefore, in this article, the carefully preprocessed user correlation matrix both improves accuracy and speeds up convergence.
Figure 6. Training curves of LGC-based models, evaluated by training loss and testing recall@20 on the Gowalla data set.
Figure 7. Training curves of LGC-based models, evaluated by training loss and testing recall@20 on the LibraryThing data set.
D. Explaining the Influence of Interaction Data and Social Network Friendship Data
One of our main research questions is the influence of interaction data and social friendship data, considered separately and aggregated. We implemented the model in a modular form that can enable or disable the interaction embedding and the social friend embedding, to make our model explainable. In Tables 7 and 8, we compare four models: LightGCN (which can be considered a baseline without the user correlation matrix and social friendship data); our model with only interaction data or only social friendship data; and our full model.
We conclude that for data sets related to geographical locations on social networking platforms, social friendship data creates the main impact; it contributes largely to the increase in precision and recall in our model. For platforms where the friendship network is just a secondary function, such as the LibraryThing data set in Table 8, whose social network exists mainly for private messaging, the interaction data plays the crucial role. Finally, combining both sources of data achieves the best results.
Conclusion and Future Works
A. Conclusion
In this work, we have presented an effective model that includes data filtering to remove noisy interactions between users and items, an evaluation of the influence of the weight matrices and activation function on the results, and a method of consolidating data when the similarity of users comes from multiple information sources, as well as a comparison with effective recommender system models presented recently.
Designing the input with multiple embeddings enables the model to quickly capture interaction signals and significantly increases its efficiency. Our model inherits the most effective features of previously introduced algorithms: the high-order connectivity learning structure, the removal of the weight matrices, and the elimination of the nonlinear activation function. We also evaluated the influence of social friendships on the recommendation process and pointed out some of the difficulties of using this data.
B. Future Work
During the process of conducting this research, we identified several issues that deserve further exploration. One is group recommendation, which produces results not for a single user but for a whole group of friends; such a model is useful in contexts where a group of friends go to the movies together, visit the same attraction, or a group of students take a course. Another issue is that items have been treated as individuals; they are not clustered or grouped into catalogs, which may discard useful information from the set of items in the final recommendation result. In fact, items are closely related when they belong to the same commodity group, when they are located in nearby geographic locations, or when they are movies of the same genre.