somoclu
How does batch parallel training actually work?
I was reading the documentation for Somoclu, which states that a batch training mode has to be used to make the algorithm parallelizable. The weight-update equation is given on page 4 of the document.
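For reference, here is the batch update as I understand it (notation mine; I am paraphrasing what I take to be the standard batch SOM rule, so correct me if the paper's equation differs):

$$w_k(t_f) = \frac{\sum_{t'=t_0}^{t_f} h_{b(t'),k}(t')\, x(t')}{\sum_{t'=t_0}^{t_f} h_{b(t'),k}(t')}$$

where $b(t')$ is the best-matching unit of sample $x(t')$ and $h_{b(t'),k}$ is the neighborhood function between that unit and node $k$.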
I don't understand why that formulation converges. Shouldn't the update involve the difference (x - w) rather than x alone, as in the online rule? What troubles me is that the past values of the weights do not appear in the update rule at all.
Does that equation make sense? If it does, why?
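To make the question concrete, here is a minimal numpy sketch of what one batch step appears to compute (function and variable names are mine, not Somoclu's API):

```python
import numpy as np

def batch_som_step(X, W, grid, sigma):
    """One batch SOM update: every codebook vector becomes a
    neighborhood-weighted mean of the data points.

    X    : (n_samples, dim) data
    W    : (n_nodes, dim) current codebook
    grid : (n_nodes, 2) node coordinates on the map
    sigma: neighborhood radius
    """
    # Best-matching unit per sample -- the only place W is consulted
    bmu = np.argmin(((X[:, None, :] - W[None, :, :]) ** 2).sum(-1), axis=1)
    # Gaussian neighborhood weights h[c, k] between map nodes c and k
    d2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(-1)
    h = np.exp(-d2 / (2 * sigma ** 2))
    H = h[bmu]                        # (n_samples, n_nodes)
    num = H.T @ X                     # weighted sums of the data
    den = H.sum(axis=0)[:, None]      # total weight reaching each node
    return num / np.maximum(den, 1e-12)
```

If this sketch is right, the old weights enter only through the BMU search, and the new weights are a pure weighted average of the data with no (x - w) term, which is exactly what I find puzzling.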