On this page, you can find notes, errata, and additional information for the book “Machine Learning Algorithms”:
Pages 238 and 268:
The singular value decomposition is meant in its truncated form, computed without the full matrices, and is therefore limited to the top t singular values and their corresponding singular vectors. The correct formula is:
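In standard notation, a truncated decomposition of this kind is usually written as M_t = U_t Σ_t V_tᵀ, where U_t and V_t hold the first t left and right singular vectors and Σ_t is the diagonal matrix of the t largest singular values. This is the common textbook form and is stated here as an assumption; the book's own symbols may differ. A minimal NumPy sketch of the idea follows; the function name `truncated_svd` and the random test matrix are hypothetical and used only for illustration.

```python
import numpy as np


def truncated_svd(M, t):
    """Return the rank-t factors (U_t, s_t, Vt_t) of M (illustrative helper)."""
    # full_matrices=False avoids building the full square U and V matrices.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # NumPy returns singular values in descending order, so the first t
    # entries are the t principal singular values.
    return U[:, :t], s[:t], Vt[:t, :]


# Hypothetical example data: a random 6 x 4 matrix.
rng = np.random.default_rng(0)
M = rng.normal(size=(6, 4))

U_t, s_t, Vt_t = truncated_svd(M, t=2)

# Rank-2 reconstruction M_t = U_t diag(s_t) V_t^T.
M_t = U_t @ np.diag(s_t) @ Vt_t
```

For large sparse matrices, computing the complete decomposition and slicing it is wasteful; routines such as `scipy.sparse.linalg.svds` compute only the requested singular triplets.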
Addendum: In “stochastic” gradient descent, the batch size is always 1. This means that a weight update is performed after each individual sample is presented.
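The per-sample update can be sketched as follows. This is a minimal illustration, assuming a linear model with squared-error loss; the function name `sgd_linear_regression`, the learning rate, and the synthetic data are hypothetical choices and do not come from the book.

```python
import numpy as np


def sgd_linear_regression(X, y, lr=0.05, epochs=50, seed=0):
    """Fit y ~ X @ w with stochastic gradient descent (batch size 1)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    n_updates = 0
    for _ in range(epochs):
        # Visit the samples in a new random order at every epoch.
        for i in rng.permutation(len(X)):
            # Gradient of 0.5 * (x_i . w - y_i)^2 with respect to w,
            # computed from a single sample.
            error = X[i] @ w - y[i]
            w -= lr * error * X[i]
            # One weight update per presented sample.
            n_updates += 1
    return w, n_updates


# Hypothetical noise-free data generated from known weights.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
true_w = np.array([2.0, -3.0])
y = X @ true_w

w, n_updates = sgd_linear_regression(X, y)
```

With 100 samples and 50 epochs, the loop performs 100 × 50 = 5000 weight updates, whereas full-batch gradient descent would perform only 50.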