Machine Learning Algorithms – Second Edition

The second edition (fully revised, extended, and updated) of Machine Learning Algorithms has been published today and will soon be available through all channels. From the back cover: Machine learning has gained tremendous popularity for its powerful and fast predictions through large datasets. However, the true forces behind its powerful…

Recommendations and User-Profiling from Implicit Feedbacks

The vast majority of B2C services are quickly discovering the strategic importance of solid recommendation engines to improve conversion rates and establish stronger customer loyalty. The most common strategies are based [3] on the segmentation of users according to their personal features…

Artificial Intelligence is a matter of Language

“The limits of my language mean the limits of my world.” (L. Wittgenstein) When Jacques Lacan proposed his psychoanalytical theory based on the influence of language on human beings, many listeners were initially astonished. Is language an actual limitation? In popular culture, it isn’t. It cannot be! But,…

Twitter Sentiment Analysis with Gensim Word2Vec and Keras Convolutional Networks

Word2Vec (https://code.google.com/archive/p/word2vec/) offers a very interesting alternative to classical NLP based on term-frequency matrices. In particular, as each word is embedded into a high-dimensional vector, it’s possible to consider a sentence as a sequence of points that determine an implicit geometry. For this reason, the idea of considering 1D…

BBC News classification algorithm comparison

The BBC News dataset (available for download from the Insight Project Resources website) is made up of 2225 news-lines classified into 5 categories (Politics, Sport, Entertainment, Tech, Business) and, similarly to Reuters-21578, it can be used to test both the efficacy and the efficiency of different classification strategies. In the repository: https://github.com/giuseppebonaccorso/bbc_news_classification_comparison,…

Reuters-21578 text classification with Gensim and Keras

Reuters-21578 is a collection of about 20K news-lines (see reference for more information, downloads and copyright notice), structured using SGML and categorized with 672 labels. They are divided into five main categories: Topics, Places, People, Organizations, and Exchanges. However, most of them are unused and, looking at the distribution, it’s…