• Blog
  • Podcasts
  • Books
  • Resume / CV
  • Bonaccorso’s Law
  • Essays
  • Contact
  • Testimonials
  • Disclaimer

Giuseppe Bonaccorso

Artificial Intelligence – Machine Learning – Data Science

  • Blog
  • Podcasts
  • Books
  • Resume / CV
  • Bonaccorso’s Law
  • Essays
  • Contact
  • Testimonials
  • Disclaimer

A model-free collaborative recommendation system in 20 lines of Python code

09/13/2017 Artificial Intelligence Generic Machine Learning Machine Learning Algorithms Addenda Python Scikit-Learn No Comments

Model-free collaborative filtering is a “lightweight” approach to recommendation systems. It’s always based on the implicit “collaboration” (in terms of ratings) among users, but it is computed in-memory without the usage of complex algorithms like ALS (Alternating Least Squares) that can be executed in parallel environment (like Spark).

If we assume to have N users and M products, it’s possible to define a user-preference sparse matrix:

Each element represents the rating given by the user i to the item j (0 means that no rating has been provided). To face the differences belonging to many different contexts, it’s useful to define two kinds of ratings (or feedbacks):

  • Explicit feedback: an actual rating (like 3 stars or 8 out of 10)
  • Implicit feedback: a page view, song play, move play and so forth

The different kind of feedback determines also the best metric to compare users. In fact, once the user preference matrix has been populated, it’s necessary to compute the affinity matrix based on a similarity metric (the most common is the Euclidean one):

This matrix is built considering the rows of Upref. In particular, Upref[i] represents the preference vector (over all products) for the user i.

When the feedback is explicit, it can be useful to keep using the Euclidean metric, but when it’s implicit, it can a different meaning and it can be better to pick another metric function. For example, if we use a binary encoding for the feedback (0 means no interaction and 1 a single or multiple interaction), we could be interested in considering two users similar when they have the same amount of interactions (page views, song/movie plays, …), therefore the Hamming distance can be more appropriate.

In this implementation, I’ve used Scikit-Learn for computing the pairwise distances, but there are also other interesting solutions like Crab (http://muricoca.github.io/crab/). The source code is also available in this GIST.

Once the list of top neighbors has been determined, it’s possible to suggest the top-k products selecting them from the union of the most rated “neighbor” products:

The predicted rating for a suggested item can be obtained as the weighted average of the neighborhood. For example, it’s possible to determine a weigh vector using a softmax function:

Alpha is a normalization parameter introduced to avoid an overflow when the distances are too large. The weight vector is referred to the subset of suggested items. The average rating is obtained as:

Of course, this is only a possibility, and the reader can employ different techniques to compute the expected rating. For example, it’s possible to use a radial basis function with a low variance to penalize the contributions given by the most distant elements. At the same time, it’s possible to implement different affinities for the same task, evaluating the difference in the suggested products. A recommender isn’t seldom based on a “off-the-shelf” solution because of the intrinsic diversities in several business contexts, therefore I always suggest to try various alternatives before making the final decision.

See also:

SVD Recommendations using Tensorflow – Giuseppe Bonaccorso

Recommendation system based on the user-item matrix factorization have become more and more important thanks to powerful and distributable algorithms like ALS, but sometimes the number of users and/or items is not so huge and the computation can be done using directly a SVD (Singular Value Decomposition) algorithm.

Share:

  • Click to share on Twitter (Opens in new window)
  • Click to share on Facebook (Opens in new window)
  • Click to share on LinkedIn (Opens in new window)
  • Click to share on Pocket (Opens in new window)
  • Click to share on Tumblr (Opens in new window)
  • Click to share on Reddit (Opens in new window)
  • Click to share on Pinterest (Opens in new window)
  • Click to share on Skype (Opens in new window)
  • Click to share on WhatsApp (Opens in new window)
  • Click to share on Telegram (Opens in new window)
  • Click to email this to a friend (Opens in new window)
  • Click to print (Opens in new window)

You can also be interested in these articles:

machine learningpythonrecommendationsscikit-learn

Artificial Intelligence is a matter of Language

Quickprop: an almost forgotten neural training algorithm

0 thoughts on “A model-free collaborative recommendation system in 20 lines of Python code”

  1. Jayaraj Chanku
    02/14/2018 at 18:01

    Great information. Thanks for sharing this amazing guidelines. This article really helped me to gather what I was searching for. Thank you once again.

    Reply

Leave a Reply Cancel reply

Follow Me

  • linkedin
  • twitter
  • facebook
  • googlescholar
  • youtube
  • github
  • amazon
  • medium

Search articles

Latest blog posts

  • Mastering Machine Learning Algorithms Second Edition 02/03/2020
  • EphMrA 2019 Switzerland one day meeting 08/30/2019
  • Machine Learning Algorithms – Second Edition 08/28/2018
  • Recommendations and User-Profiling from Implicit Feedbacks 07/10/2018
  • Are recommendations really helpful? A brief non-technical discussion 06/29/2018

Subscribe to this blog

Join 2,199 other subscribers

Follow me on Twitter

My Tweets
Copyright © Giuseppe Bonaccorso. All Rights Reserved
Proudly powered by WordPress | Theme: Doo by ThemeVS.
loading Cancel
Post was not sent - check your email addresses!
Email check failed, please try again
Sorry, your blog cannot share posts by email.