Model-free collaborative filtering is a “lightweight” approach to recommendation systems. It is still based on the implicit “collaboration” (in terms of ratings) among users, but it is computed in-memory, without complex algorithms like ALS (Alternating Least Squares) that can be executed in parallel environments (like Spark).
If we assume to have N users and M products, it’s possible to define a user-preference sparse matrix:
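As a minimal sketch (the sizes and ratings below are only illustrative), the user-preference matrix can be built from a list of (user, item, rating) triplets, keeping 0 for every missing rating:

```python
import numpy as np

# Hypothetical example: N users, M products, ratings as (user, item, rating) triplets
N, M = 5, 8
ratings = [(0, 1, 4.0), (0, 3, 2.0), (1, 1, 5.0), (2, 6, 3.0), (4, 0, 1.0)]

# User-preference matrix: 0 means that no rating has been provided
Upref = np.zeros((N, M))
for user, item, rating in ratings:
    Upref[user, item] = rating

print(Upref.shape)  # (5, 8)
```

For a really large number of users and items, a `scipy.sparse` format would be preferable, but a dense array keeps the example simple.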
Each element represents the rating given by user i to item j (0 means that no rating has been provided). To cope with the differences among contexts, it’s useful to define two kinds of ratings (or feedbacks):
- Explicit feedback: an actual rating (like 3 stars or 8 out of 10)
- Implicit feedback: a page view, a song play, a movie play, and so forth
The kind of feedback also determines the best metric to compare users. In fact, once the user-preference matrix has been populated, it’s necessary to compute the affinity matrix based on a similarity metric (the most common is the Euclidean one):
This matrix is built considering the rows of Upref. In particular, Upref[i] represents the preference vector (over all products) for the user i.
When the feedback is explicit, it can be useful to keep using the Euclidean metric, but when it’s implicit, it can have a different meaning, and it can be better to pick another metric function. For example, if we use a binary encoding for the feedback (0 means no interaction and 1 a single or multiple interaction), we could consider two users similar when they have interacted with the same items (page views, song/movie plays, …); in that case, the Hamming distance can be more appropriate.
In this implementation, I’ve used Scikit-Learn for computing the pairwise distances, but there are also other interesting solutions like Crab (http://muricoca.github.io/crab/). The source code is also available in this GIST.
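A short sketch of both metrics with Scikit-Learn’s `pairwise_distances` (the random matrix is only a stand-in for real data):

```python
import numpy as np
from sklearn.metrics import pairwise_distances

rng = np.random.RandomState(1000)
Upref = rng.randint(0, 6, size=(20, 10)).astype(np.float64)  # explicit ratings 0..5

# Euclidean affinity for explicit feedback
D_euclidean = pairwise_distances(Upref, metric='euclidean')

# Hamming affinity for implicit feedback: binarize first (0 = no interaction)
Ubin = (Upref > 0).astype(np.float64)
D_hamming = pairwise_distances(Ubin, metric='hamming')

# Top 5 neighbors of user 0 (position 0 is the user itself, distance 0)
neighbors = np.argsort(D_euclidean[0])[1:6]
```

Both matrices are symmetric with a zero diagonal, so each row can be sorted directly to obtain the nearest neighbors of the corresponding user.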
Once the list of top neighbors has been determined, it’s possible to suggest the top-k products selecting them from the union of the most rated “neighbor” products:
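A possible sketch of this selection step (the ranking by mean neighborhood rating is one reasonable choice, not the only one):

```python
import numpy as np

def suggest_top_k(Upref, user, neighbors, k=5):
    # Union of products rated by the neighbors, excluding those the user already rated
    candidate_mask = (Upref[neighbors] > 0).any(axis=0) & (Upref[user] == 0)
    candidates = np.where(candidate_mask)[0]
    # Rank the candidates by their mean rating over the neighborhood
    mean_ratings = Upref[np.ix_(neighbors, candidates)].mean(axis=0)
    order = np.argsort(-mean_ratings)
    return candidates[order][:k]

Upref = np.array([
    [5.0, 0.0, 3.0, 0.0],
    [4.0, 2.0, 3.0, 0.0],
    [5.0, 1.0, 0.0, 4.0],
])
print(suggest_top_k(Upref, user=0, neighbors=[1, 2], k=2))  # [3 1]
```

In the toy matrix, item 3 has a higher mean rating (2.0) among the neighbors than item 1 (1.5), so it is suggested first.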
The predicted rating for a suggested item can be obtained as a weighted average over the neighborhood. For example, it’s possible to determine a weight vector using a softmax function:
Alpha is a normalization parameter introduced to avoid an overflow when the distances are too large. The weight vector refers to the subset of suggested items. The average rating is obtained as:
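A minimal sketch of both steps. The sign convention is an assumption on my part: exponentiating the negated, alpha-scaled distances keeps the exponent numerically stable and gives closer neighbors larger weights:

```python
import numpy as np

def softmax_weights(distances, alpha=0.05):
    # Alpha rescales the exponent so large distances stay numerically stable;
    # the minus sign (an assumption here) gives closer neighbors larger weights
    v = np.exp(-alpha * np.asarray(distances, dtype=np.float64))
    return v / v.sum()

def predicted_rating(neighbor_ratings, distances, alpha=0.05):
    # Weighted average of the neighbors' ratings for the suggested item
    w = softmax_weights(distances, alpha)
    return float(np.dot(w, neighbor_ratings))
```

For example, `predicted_rating([4.0, 2.0], [1.0, 10.0])` lands between 2 and 4, pulled toward the rating of the much closer first neighbor.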
Of course, this is only one possibility, and the reader can employ different techniques to compute the expected rating. For example, it’s possible to use a radial basis function with a low variance to penalize the contributions of the most distant elements. At the same time, it’s possible to implement different affinities for the same task, evaluating the differences in the suggested products. A recommender is seldom based on an “off-the-shelf” solution because of the intrinsic diversity of business contexts, therefore I always suggest trying various alternatives before making the final decision.
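The radial basis function alternative can be sketched like this (the `gamma` parameter, the inverse of twice the variance, is a hypothetical choice):

```python
import numpy as np

def rbf_weights(distances, gamma=2.0):
    # Gaussian radial basis: a low variance (high gamma) quickly suppresses
    # the contribution of the most distant neighbors
    v = np.exp(-gamma * np.asarray(distances, dtype=np.float64) ** 2)
    return v / v.sum()
```

Compared with the softmax, the squared distance in the exponent makes the decay much sharper, so far-away neighbors contribute almost nothing to the weighted average.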
Recommendation systems based on user-item matrix factorization have become more and more important thanks to powerful and distributable algorithms like ALS, but sometimes the number of users and/or items is not so huge and the computation can be done directly with an SVD (Singular Value Decomposition) algorithm.