Comments for Giuseppe Bonaccorso https://www.bonaccorso.eu Artificial Intelligence - Machine Learning - Data Science Fri, 07 Sep 2018 15:40:19 +0000 hourly 1 https://wordpress.org/?v=4.9.8 Comment on Twitter Sentiment Analysis with Gensim Word2Vec and Keras Convolutional Networks by Giuseppe Bonaccorso https://www.bonaccorso.eu/2017/08/07/twitter-sentiment-analysis-with-gensim-word2vec-and-keras-convolutional-networks/#comment-137 Fri, 07 Sep 2018 15:40:19 +0000 https://www.bonaccorso.eu/?p=1080#comment-137 Hi,
in a convolutional network, it doesn’t make sense to talk about neurons. In this case, there are 8 layers (each separated by a dropout layer) with 32 (3×1) kernels (with ELU activation), followed by 2 dense tanh layers with 256 neurons each and a softmax output layer with 2 units.
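As a rough illustration, the architecture described above could be sketched in Keras as follows (this is not the exact code from the post; the input length and vector size are placeholder values):

```python
# Hypothetical sketch of the architecture described above: 8 stacked 1D
# convolutional layers with 32 kernels of size 3 (ELU activation),
# separated by dropout, followed by two tanh dense layers with 256 units
# and a 2-unit softmax output. max_len and vec_size are placeholders.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D, Dropout, Flatten, Dense

def build_model(max_len=15, vec_size=512):
    model = Sequential()
    model.add(Conv1D(32, kernel_size=3, activation='elu', padding='same',
                     input_shape=(max_len, vec_size)))
    for _ in range(7):
        model.add(Dropout(0.25))
        model.add(Conv1D(32, kernel_size=3, activation='elu', padding='same'))
    model.add(Flatten())
    model.add(Dense(256, activation='tanh'))
    model.add(Dense(256, activation='tanh'))
    model.add(Dense(2, activation='softmax'))
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    return model

model = build_model()
```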

The number of layers can be analyzed in many ways:

  • Experience
  • Validation
  • Grid-search

In general, it’s helpful to start with smaller models, checking the validation accuracy, overfitting, and so on, and then making a decision (e.g. adding new layers, increasing or decreasing the number of units, adding regularization, dropout, batch normalization, …). The golden rule (derived from Occam’s razor) is to try to find the smallest model which achieves the highest validation accuracy. An alternative (but more expensive) approach is based on a grid search. In this case, a set of models based on different parameter combinations is trained sequentially (or in parallel, if you have enough resources) and the optimal configuration (corresponding to the highest accuracy/smallest loss) is selected. Normally this approach requires more iterations because the initial grid is coarse-grained and is used to determine the sub-space where the optimal parameter set is located. Then, several zooms are performed in order to fine-tune the search.
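As a toy illustration of the coarse-to-fine (“zoom”) grid search described above, with a hypothetical evaluate() standing in for a full train-and-validate cycle:

```python
# Coarse-to-fine grid search sketch over a single hyperparameter (a
# learning rate). evaluate() is a stand-in for "train the model and
# return the validation accuracy"; here it is a toy surrogate with a
# known optimum around lr = 0.01.
import numpy as np

def evaluate(lr):
    # Toy surrogate: score peaks when lr is close to 0.01
    return 1.0 - abs(np.log10(lr) - np.log10(0.01))

def coarse_to_fine(lo, hi, steps=5, zooms=3):
    for _ in range(zooms):
        # Evaluate a coarse grid, then zoom into the best neighborhood
        grid = np.logspace(np.log10(lo), np.log10(hi), steps)
        scores = [evaluate(lr) for lr in grid]
        best = int(np.argmax(scores))
        lo = grid[max(best - 1, 0)]
        hi = grid[min(best + 1, steps - 1)]
    return grid[best]

best_lr = coarse_to_fine(1e-5, 1.0)
```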

]]>
Comment on Twitter Sentiment Analysis with Gensim Word2Vec and Keras Convolutional Networks by sah https://www.bonaccorso.eu/2017/08/07/twitter-sentiment-analysis-with-gensim-word2vec-and-keras-convolutional-networks/#comment-136 Fri, 07 Sep 2018 08:51:52 +0000 https://www.bonaccorso.eu/?p=1080#comment-136 Nice post – I have a simple question:
In your architecture, how many hidden layers did you use? And is the number of neurons in each layer 32? Am I right?

How can you know which number of layers would be beneficial for your model?
Thanks a lot

]]>
Comment on Twitter Sentiment Analysis with Gensim Word2Vec and Keras Convolutional Networks by negar https://www.bonaccorso.eu/2017/08/07/twitter-sentiment-analysis-with-gensim-word2vec-and-keras-convolutional-networks/#comment-135 Thu, 06 Sep 2018 17:37:20 +0000 https://www.bonaccorso.eu/?p=1080#comment-135 thanks a lot

]]>
Comment on Twitter Sentiment Analysis with Gensim Word2Vec and Keras Convolutional Networks by Giuseppe Bonaccorso https://www.bonaccorso.eu/2017/08/07/twitter-sentiment-analysis-with-gensim-word2vec-and-keras-convolutional-networks/#comment-134 Thu, 06 Sep 2018 15:54:49 +0000 https://www.bonaccorso.eu/?p=1080#comment-134 It’s a deep convolutional network (1D)

]]>
Comment on Twitter Sentiment Analysis with Gensim Word2Vec and Keras Convolutional Networks by negar https://www.bonaccorso.eu/2017/08/07/twitter-sentiment-analysis-with-gensim-word2vec-and-keras-convolutional-networks/#comment-133 Wed, 05 Sep 2018 21:56:30 +0000 https://www.bonaccorso.eu/?p=1080#comment-133 Hi. Thanks a lot for your nice explanation – I just have a question since I’m a beginner: what classifier do you use for your model? Linear? Probabilistic? And why? Please explain it, thank you

]]>
Comment on Recommendations and User-Profiling from Implicit Feedbacks by Giuseppe Bonaccorso https://www.bonaccorso.eu/2018/07/10/recommendations-user-profiling-implicit-feedbacks/#comment-132 Wed, 05 Sep 2018 15:47:05 +0000 https://www.bonaccorso.eu/?p=2136#comment-132 Thanks, Antoine.
The word vector selector has not been detailed because I’m planning to post a complete working example. However, the idea is based on a “pseudo-attention” mechanism implemented with a simple MLP with a softmax output (the input length is fixed and the sentences are padded or truncated). Each value represents the probability that a specific word is representative of a context. The network is trained with labeled examples and, thanks to the word vectors, is also very robust to synonyms.

Each training couple is made up of (wv1, wv2, …, wvN) -> (p1, p2, …, pN), where the probabilities are non-zero only for the representative vectors (an alternative approach is based on sigmoids, but the training speed was slower and the final accuracy worse). E.g. “The restaurant is nice but the food is quite bad” -> Word vectors -> Targets: “restaurant” and “food” (so the softmax output would be 0.0, 0.0, 0.5, …, 0.5, 0.0, 0.0). As we want to perform a “local” sentiment analysis, each “peak” in the softmax is surrounded by a set of additional words. Hence, in this case, for example, we want a peak for “restaurant” and a smaller value for “nice” (e.g. 0.3 and 0.2).
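As an illustration, a target distribution like the one described above could be built as follows (the word indices and weights are hypothetical, chosen only to show the shape of the target):

```python
# Sketch of a softmax target for the padded sentence above, with peaks
# on the aspect words "restaurant" (index 1) and "food" (index 6) and
# smaller weights on the surrounding opinion words "nice" (index 3)
# and "bad" (index 9). Indices and weights are illustrative.
import numpy as np

def make_target(length, peaks, context, peak_w=0.3, ctx_w=0.2):
    t = np.zeros(length)
    t[list(peaks)] = peak_w
    t[list(context)] = ctx_w
    return t / t.sum()  # normalize so the target is a valid distribution

tokens = "the restaurant is nice but the food is quite bad".split()
target = make_target(len(tokens), peaks=[1, 6], context=[3, 9])
```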

Once this submodel has been trained, we freeze it and train the convolutional network. I hope this very brief explanation is helpful.

]]>
Comment on Recommendations and User-Profiling from Implicit Feedbacks by antoine https://www.bonaccorso.eu/2018/07/10/recommendations-user-profiling-implicit-feedbacks/#comment-131 Wed, 05 Sep 2018 12:30:40 +0000 https://www.bonaccorso.eu/?p=2136#comment-131 Hi Giuseppe,

Your work looks so interesting.

I’d love to get a bit more insight into how the “word vector selector” works exactly, though.

Thanks
Antoine

]]>
Comment on Twitter Sentiment Analysis with Gensim Word2Vec and Keras Convolutional Networks by Giuseppe Bonaccorso https://www.bonaccorso.eu/2017/08/07/twitter-sentiment-analysis-with-gensim-word2vec-and-keras-convolutional-networks/#comment-130 Fri, 31 Aug 2018 16:32:13 +0000 https://www.bonaccorso.eu/?p=1080#comment-130 From your error, I suppose you’re feeding the labels (which should be one-hot encoded for a cross-entropy loss, so the shape should be (7254, num classes)) as input to the convolutional layer. Honestly, I don’t know how to help you. The input shape should be (num samples, max length, vector size), hence check if X has such a shape before splitting. Moreover, as the output is binary, Y should be (num samples, 2). For example, positive (1.0, 0.0) or negative (0.0, 1.0).

]]>
Comment on Twitter Sentiment Analysis with Gensim Word2Vec and Keras Convolutional Networks by Giuseppe Bonaccorso https://www.bonaccorso.eu/2017/08/07/twitter-sentiment-analysis-with-gensim-word2vec-and-keras-convolutional-networks/#comment-129 Fri, 31 Aug 2018 16:26:53 +0000 https://www.bonaccorso.eu/?p=1080#comment-129 The subdivision into 2 or 3 blocks is a choice with a specific purpose. If the dataset is assumed to be sampled from a specific data-generating process, we want to train the model using a subset representing the original distribution and validate it using another set of samples (drawn from the same process) that have never been used for training. In some cases, it’s helpful to have a test set, which is employed for hyperparameter tuning and architectural choices, and a “final” validation set, which is employed only for a pure, unbiased evaluation. In both scenarios (2 or 3 sets), the goal is the same and the only very important condition is that all 2/3 sets are drawn from the same distribution. Of course, feel free to split into 3 sets if you prefer this strategy.
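A minimal sketch of a 3-way split drawn from the same shuffled dataset (the proportions and helper name are illustrative, not the post’s code):

```python
# Split a dataset into train / test / validation blocks after a single
# shuffle, so all three subsets come from the same distribution.
import numpy as np

def split_3(X, Y, train=0.7, test=0.15, seed=1000):
    rs = np.random.RandomState(seed)
    idx = rs.permutation(len(X))
    n_tr = int(train * len(X))
    n_te = int(test * len(X))
    tr, te, va = idx[:n_tr], idx[n_tr:n_tr + n_te], idx[n_tr + n_te:]
    return (X[tr], Y[tr]), (X[te], Y[te]), (X[va], Y[va])

X = np.arange(100).reshape(100, 1)
Y = np.arange(100)
train_set, test_set, val_set = split_3(X, Y)
```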

]]>
Comment on Twitter Sentiment Analysis with Gensim Word2Vec and Keras Convolutional Networks by fer https://www.bonaccorso.eu/2017/08/07/twitter-sentiment-analysis-with-gensim-word2vec-and-keras-convolutional-networks/#comment-125 Thu, 30 Aug 2018 07:10:55 +0000 https://www.bonaccorso.eu/?p=1080#comment-125 Excuse me, why don’t you separate your corpus into 3 parts: training, testing, and validation?

]]>