Poster-Presentation without proceedings
The 15th International Conference on Electronic Commerce and Web Technologies (EC-Web 2014), Munich, Germany
Introduction -- We developed a spreading-activation-based rating estimation method for recommender systems. As introduced in our previous paper, we represent the information in a directed hypergraph, similar to the way a semantic network represents information. The graph is heterogeneous: its nodes contain different kinds of entities and its edges different kinds of relations. To estimate a user's rating of a specific item, we proposed a modified version of the collaborative filtering measure, called recommendation spreading. It replaces the original, Pearson-correlation-based user similarity with a spreading-activation-based similarity measure. We validated our approach on the MovieLens 1M dataset. Our evaluation shows that we achieved an MAE very similar to that of collaborative filtering while increasing the coverage from 61.95% to 78.43%.

Dataset -- GroupLens publishes three MovieLens datasets: MovieLens 100k, MovieLens 1M and MovieLens 10M. We selected MovieLens 1M, which is the richest in user and item attributes. The dataset contains the gender, age, occupation and zip code of users and the genre and publication year of items. It also contains timestamped rating data.

Representation -- The information is represented in a directed, labeled graph. The nodes represent the entities that play a role in the recommendation scenario; the edges represent the relations between these entities. We introduced a node for each user and each item. To represent attribute values, we introduced a node for each attribute value. For example, the movie genre "Comedy" is represented by one node and the occupation "Lawyer" by another. To indicate that a user has a specific occupation, an edge of type "PersonOccupation" is introduced between the node representing the user and the node representing the occupation.
To store historical rating information in the graph, we introduced the edge type "ItemRating"; an edge of this type indicates that the user has rated the corresponding movie. The value of the rating is assigned to the "ItemRating" edge as an edge attribute.

Calculation -- To estimate the rating of a specific user for a specific item, we start a spreading activation from the node representing the user. The activation relax parameter, which decreases the activation of a node in each spreading step, is set to 0.5. The spreading relax parameter is also set to 0.5. To calculate the outgoing spreading value, we multiply the activation of a node by the spreading relax parameter. The outgoing activation is then distributed among all edges belonging to that node. The spreading runs with a step-limit-based termination criterion; the step limit is set to 8.

For each edge of type "ItemRating", all the activation that has flowed through the edge is accumulated. The more parallel paths lead to the edge, the higher the accumulated activation. The greater the distance of the edge from the user, the lower the activation. To estimate the rating of a specific item, we take all the edges of type "ItemRating" connected to the item and substitute each edge's rating value, weighted by its accumulated activation, into the collaborative filtering formula.

Evaluation Method -- To evaluate our method, we iterated through the historical rating data of the dataset in timestamp order. The evaluation was run on the first 10,002 rating values. Each evaluation step consisted of the following: request a rating estimation from the recommender engine, record the rating error, and add the edge representing the rating to the graph.

Results -- We compared our method to the collaborative filtering formula. User similarity for collaborative filtering was calculated with Pearson correlation.
Our results show that the two methods have a similar MAE: 0.16670 for our method and 0.16673 for collaborative filtering. While providing estimations of very similar quality, our method has a higher coverage (78.43%) than collaborative filtering (61.95%). By coverage we mean the proportion of cases in which the recommendation method could provide an estimation. Another important feature of our recommendation spreading method is that it can provide a significantly higher number of rating estimations in the cold start case, where fewer rating values are available in the graph. At step 1,000, the coverage of recommendation spreading is 33.1%, whereas the coverage of collaborative filtering is 12%.

Summary and Future Work -- We developed a spreading-activation-based algorithm to estimate ratings in recommender systems and compared it to collaborative filtering on the MovieLens 1M dataset. We represent the information in a directed graph whose nodes represent heterogeneous information such as users, items and attribute values. To calculate rating estimations, we modified the collaborative filtering formula: in our approach, the weight of each rating value is the activation that has flowed through the edge representing an already known rating. Our results show a significantly higher coverage with the same estimation quality. In future work we plan to refine the calculation methods, for example by introducing weights for edge types and training these weights based on user feedback.
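
As an illustration of the calculation described above, the following minimal Python sketch spreads activation from a user node and combines the known ratings of an item, weighted by the activation accumulated on their "ItemRating" edges. It is not the implementation that was evaluated. The values 0.5, 0.5 and 8 come from the text; everything else is an assumption made for the example: the function names `spread` and `estimate_rating`, the tuple layout of the edge list, a start activation of 1.0, an equal split of the outgoing activation among a node's edges, traversal of edges in both directions, and a plain weighted average in place of the full collaborative filtering formula (a common form of that formula also mean-centers the ratings, which is omitted here).

```python
from collections import defaultdict

ACTIVATION_RELAX = 0.5  # value given in the text
SPREADING_RELAX = 0.5   # value given in the text
STEP_LIMIT = 8          # value given in the text


def spread(edges, start, steps=STEP_LIMIT):
    """Return {edge index: accumulated activation} for "ItemRating" edges.

    edges is a list of (node_a, node_b, edge_type, rating_or_None) tuples;
    this layout is hypothetical and chosen only for the sketch.
    """
    # Assumption: activation may travel along an edge in both directions.
    adjacency = defaultdict(list)
    for idx, (a, b, _edge_type, _rating) in enumerate(edges):
        adjacency[a].append((b, idx))
        adjacency[b].append((a, idx))

    activation = {start: 1.0}  # assumed start activation
    flow = defaultdict(float)
    for _ in range(steps):
        next_activation = defaultdict(float)
        for node, value in activation.items():
            # The node's own activation decays in every step.
            next_activation[node] += value * ACTIVATION_RELAX
            neighbours = adjacency.get(node, [])
            if not neighbours:
                continue
            # Outgoing value, split equally among the node's edges (assumption).
            share = value * SPREADING_RELAX / len(neighbours)
            for neighbour, idx in neighbours:
                next_activation[neighbour] += share
                if edges[idx][2] == "ItemRating":
                    flow[idx] += share
        activation = next_activation
    return flow


def estimate_rating(edges, user, item):
    """Activation-weighted average of the known ratings of item.

    Returns None when no activation reached any rating of the item,
    i.e. the case that does not count towards coverage.
    """
    flow = spread(edges, user)
    weighted_sum = 0.0
    total_weight = 0.0
    for idx, (a, b, edge_type, rating) in enumerate(edges):
        if edge_type == "ItemRating" and item in (a, b) and user not in (a, b):
            weight = flow.get(idx, 0.0)
            weighted_sum += weight * rating
            total_weight += weight
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight


# Toy graph with made-up users, movies and ratings.
toy_edges = [
    ("u1", "m1", "ItemRating", 4),
    ("u2", "m1", "ItemRating", 4),
    ("u2", "m2", "ItemRating", 5),
    ("u3", "m2", "ItemRating", 1),
    ("u1", "Lawyer", "PersonOccupation", None),
    ("u3", "Lawyer", "PersonOccupation", None),
]
print(estimate_rating(toy_edges, "u1", "m2"))
```

In the toy graph, user u1 reaches the ratings of m2 both through the shared movie m1 (via u2) and through the shared occupation node (via u3), which shows how attribute nodes can contribute to an estimation even when two users have no rated items in common.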
Information and Communication Technology