Machine Learning Algorithms: Errata and Additional Notes

On this page, you can find notes, errata, and additional pieces of information for the book “Machine Learning Algorithms”. Related posts and notes can be found in the section: Machine Learning Algorithms Addenda.


Page 59:

The transformation matrix W for PCA must be transposed:
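The corrected formula itself (originally an image) is missing from this page. A plausible reconstruction, assuming x is a single sample as a column vector and the columns of W are the principal components (the top eigenvectors of the covariance matrix), is:

```latex
% Projection of a sample x onto the principal subspace;
% W^T (not W) maps from the original space to the component space.
z = W^{T} x
```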


Page 97:

As explained in the previous chapters, it’s almost always good practice to normalize the dataset. In this way, it becomes zero-centered, and in the linear expression it’s possible to avoid using a bias term. Otherwise, it’s necessary to rewrite the expression as:

Both w and b are parameters to learn.
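The rewritten expression (originally an image) is missing from this page. The standard linear model with an explicit bias term, which the surrounding text describes, is:

```latex
% Linear prediction with an explicit bias b, needed when
% the dataset is not zero-centered.
\tilde{y} = \mathbf{w}^{T} \mathbf{x} + b
```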


Page 100:

The left-hand side of the cross-entropy formula is wrong: its arguments must be the two distributions. The correct one is:
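The corrected formula (originally an image) is not reproduced here. The standard cross-entropy between two discrete distributions p and q, which takes both distributions as arguments, is:

```latex
% Cross-entropy of q relative to p (both are distributions,
% not single values):
H(p, q) = -\sum_{x} p(x) \log q(x)
```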


Page 103:

Addendum: In “stochastic” gradient descent, the batch size is often set equal to 1. This means that a weight update is performed after every single sample is presented. However, in many papers and books the attribute “stochastic” is applied to any mini-batch size.
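A minimal sketch of this per-sample update rule for a linear regression model (the function name, learning rate, and loss are illustrative assumptions, not taken from the book):

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.01, epochs=100, seed=0):
    """Stochastic gradient descent with batch size 1 on squared loss:
    the weights are updated after every single sample."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(n_samples):   # reshuffle every epoch
            error = (X[i] @ w + b) - y[i]      # error on one sample only
            w -= lr * error * X[i]             # immediate per-sample update
            b -= lr * error
    return w, b
```

With a mini-batch variant, the gradient would instead be averaged over a small group of samples before each update; some authors still call that "stochastic" gradient descent.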


Pages 238 and 268:

The singular value decomposition is intended to be computed without the full matrices; it is therefore truncated to the t largest singular values and the corresponding singular vectors. The correct formula is:
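The corrected formula (originally an image) is not reproduced here. The standard truncated SVD, keeping only the t largest singular values, with U_t and V_t holding the first t left and right singular vectors and Σ_t the t×t diagonal matrix of those singular values, is:

```latex
% Truncated SVD: only the t principal singular values/vectors
% are computed, giving a rank-t approximation of M.
M \approx U_{t} \Sigma_{t} V_{t}^{T}
```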


Other resources:

ML Algorithms Addendum: Hebbian Learning

ML Algorithms Addendum: Mutual information in classification tasks