On this page, you can find notes, errata, and additional pieces of information for the book “Machine Learning Algorithms“. Related posts and notes can be found in the section: Machine Learning Algorithms Addenda. All the errata and typos have been integrated into the second edition of the book.

Gitter chatroom: https://gitter.im/Machine-Learning-Algorithms/Lobby

**Page 25:**

The correct numbers of misclassified samples in the figure are 3, 14, and 24, respectively.

**Page 59:**

The transformation matrix W for the PCA must be transposed:
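As a quick check of the transposition, here is a minimal sketch with scikit-learn and synthetic data (not the book's example): `PCA.components_` stores W with shape (n_components, n_features), so the projection of the centered data uses Wᵀ.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

pca = PCA(n_components=2)
Z = pca.fit_transform(X)

# components_ has shape (n_components, n_features), so projecting
# the zero-centered data requires the transpose of W
W = pca.components_
Z_manual = (X - pca.mean_) @ W.T
print(np.allclose(Z, Z_manual))  # True
```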

**Page 97:**

As explained in the previous chapters, it’s almost always good practice to normalize the dataset. In this way, it becomes zero-centered and, in the linear expression, it’s possible to avoid the use of a bias term. Otherwise, it’s necessary to rewrite the expression as:

ŷ = wᵀx + b

Both w and b are parameters to learn.
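The effect of zero-centering can be sketched with NumPy on synthetic data (the dataset and the least-squares fit are illustrative, not taken from the book): after centering both X and y, a linear fit without an intercept recovers w exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(200, 3))
w_true = np.array([1.5, -2.0, 0.5])
b_true = 4.0
y = X @ w_true + b_true

# Zero-center both X and y: the bias term becomes unnecessary,
# because it is absorbed by the means
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# Least-squares fit WITHOUT an intercept still recovers w
w, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
print(w)  # close to [1.5, -2.0, 0.5]
```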

**Page 100:**

The left-hand side of the cross-entropy formula is wrong because its arguments are the two distributions. The correct one is:

H(p, q) = −Σₓ p(x) log q(x)
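A minimal NumPy sketch of the standard definition, with an illustrative distribution (not the book's example); when q = p, the cross-entropy reduces to the entropy of p, and for any other q it can only be larger (Gibbs' inequality).

```python
import numpy as np

def cross_entropy(p, q):
    """H(p, q) = -sum_x p(x) * log q(x); both arguments are distributions."""
    p, q = np.asarray(p), np.asarray(q)
    return -np.sum(p * np.log(q))

p = np.array([0.7, 0.2, 0.1])
q = np.full(3, 1.0 / 3.0)

print(cross_entropy(p, p))  # entropy of p, about 0.802 (nats)
print(cross_entropy(p, q))  # larger: log(3), about 1.099
```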

**Page 103:**

Addendum: In “stochastic” gradient descent, the batch size is often set to 1, meaning that a weight update is performed after each sample is presented. However, in many papers and books the attribute “stochastic” refers to any mini-batch size.
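The two conventions can be sketched with the same NumPy loop on synthetic linear data (illustrative, not the book's example): `batch_size=1` gives the per-sample “stochastic” variant, while any larger value gives mini-batch SGD.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
w_true = np.array([2.0, -1.0])
y = X @ w_true

def sgd(X, y, batch_size=1, lr=0.05, epochs=200, seed=1):
    """Mini-batch SGD on squared error; batch_size=1 is the per-sample
    'stochastic' variant (one weight update per presented sample)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    n = len(X)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]
            grad = 2.0 * X[b].T @ (X[b] @ w - y[b]) / len(b)
            w -= lr * grad
    return w

print(sgd(X, y, batch_size=1))   # per-sample updates
print(sgd(X, y, batch_size=16))  # mini-batch updates
```

Both settings converge to the same weights here; the difference is the variance of the individual update steps.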

**Page 219:**

The dendrogram must be cut at a threshold slightly below 30.
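Cutting a dendrogram at a distance threshold can be sketched with SciPy on synthetic blobs (illustrative data, not the book's dataset); `fcluster` with `criterion='distance'` performs the cut.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# three well-separated 2D blobs (hypothetical data for illustration)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(20, 2))
               for c in ((0, 0), (10, 0), (0, 10))])

# Ward linkage builds the dendrogram
Z = linkage(X, method='ward')

# cutting just below a distance threshold yields the flat clusters
labels = fcluster(Z, t=29.0, criterion='distance')
print(len(set(labels)))  # 3
```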

**Pages 238 and 268:**

The singular value decomposition is intended without the computation of the full matrices, and it is therefore limited to the t principal singular values and vectors. The correct formula is:

X ≈ Uₜ Σₜ Vₜᵀ
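A minimal NumPy sketch of the truncated decomposition on a random matrix (illustrative, not the book's data): `full_matrices=False` avoids computing the full orthogonal matrices, and slicing keeps only the t largest singular values and vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 10))

# full_matrices=False returns only the computed singular vectors
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# keep only the t principal singular values/vectors: X ~ U_t S_t V_t^T
t = 5
X_t = U[:, :t] * s[:t] @ Vt[:t]
print(X_t.shape)  # (20, 10)
```

With t equal to the full rank, the product reconstructs X exactly; smaller t gives the best rank-t approximation in the least-squares sense.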

**Other resources:**

ML Algorithms addendum: Mutual information in classification tasks