Recommendation system based on the user-item matrix factorization have become more and more important thanks to powerful and distributable algorithms like ALS, but sometimes the number of users and/or items is not so huge and the computation can be done using directly a SVD (Singular Value Decomposition) algorithm. In this post, we’re going to discuss a very simple implementation based on Tensorflow.
The first element to define is the user-item matrix. Normally this matrix associates each user with each product through a rating (0 means that no rating has been provided) or explicit feedback. This approach is effective whenever it’s possible to ask the user to rate the items, but there are many situations where this isn’t possible. In all those cases, an implicit feedback can be employed. For example:
- For a movie/video, it’s possible to consider how many times the user watched it and the average duration
- For a song, it’s useful to consider the number of times it was played
- For a blog/magazine, a valid measure is provided by the time spent on each single page
Many measures can be used effectively; the only condition is that they must be time-invariant metric functions. If there are n users and m products, the user-item matrix is:
The SVD (limiting the computation to the p greatest singular values) allows to define the matrix as:
The columns of U contain the left singular vector, while the rows of transposed V contain the right singular vectors. Sigma is a diagonal matrix containing the singular values.
In this model, we are assuming that a rating is determined by a set k user latent factors and k item ones. A latent factor is a variable that we cannot directly observe but it’s possible to estimate. This concept is very important because the theoretical foundation of the whole model is built upon it. Just to understand, we can say that a user rates a generic product according to the gender, age range, education, interests and so forth. At the same time, a product can be virtually split into different components that contribute the the overall rating: brand, features, price, etc. Even if we expect those latent factors, we cannot easily determine them, therefore we use a factorization method. The original user-item matrix will be split into two factors: a user-factor matrix and a factor-product matrix. The multiplication removes the latent factors, rebuilding the original user-item matrix.
If we assume to have k factors, we can truncate the SVD:
In this way, we are selecting the top k singular values and the corresponding singular vectors. Predictions can be obtained after defining two auxiliary matrices:
The predicted rating for user i and product j is:
where the first term is the average rating per user. However, in this example we are interested in finding the top k suggestions for all users, therefore we are not going to consider it as it’s a constant after defining the user.
In the example, we assume to have 5000 users and 2000 products. The number of factors (k) is rather high (500) and can be dramatically reduced. The ratings are bounded between 1 and 5 (0 meaning no rating). Every user has rated 500 random products and we want to suggest the top 10 items. In this case, I’m not controlling if the item has already been rated, in fact there are situations (like Youtube) where the recommendations contain video frequently seen and others (like Netflix) where it’s more useful to suggest new (never-seen) items.
The computational time is about 5 seconds on a i7 CPU.