Francesco Ballerini

3 comments by Francesco Ballerini

Thanks @samuela, I understand that the code is a generalization of the MLP with no bias case, but still: 1. If the `moveaxis`-`reshape`-`@` operation corresponded to the Frobenius inner product...

Ok, but let us consider the MLP-with-no-bias case. The way the paper models weight matching as an LAP is In other words, it computes `A` as (1) What the...

Ok, the role of `moveaxis` is clear, and the computation matches the formula in the paper for an MLP with no biases. On the other hand, the `reshape((n, -1))` (extending...
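
For readers following the thread, here is a minimal NumPy sketch of the `moveaxis`-`reshape`-`@` pattern being discussed. It is an illustration only, not the repository's code: the helper name `cost_matrix`, the shapes, and the random weights are all assumptions. It checks numerically that, for a permutation matrix `P`, the Frobenius inner product of `P` with `A` equals the Frobenius inner product of the first weight with the permuted second weight, both for a 2-D weight (the MLP with no biases case) and for a higher-rank tensor whose permuted axis is not the first one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # number of units being permuted (hypothetical size)


def cost_matrix(w_a, w_b, axis, n):
    """Hypothetical helper: bring the permuted axis to the front,
    flatten every other axis, and take all pairwise dot products."""
    a = np.moveaxis(w_a, axis, 0).reshape((n, -1))
    b = np.moveaxis(w_b, axis, 0).reshape((n, -1))
    return a @ b.T  # shape (n, n)


# A random permutation and its matrix: row i of P picks index perm[i].
perm = rng.permutation(n)
P = np.eye(n)[perm]

# Case 1: 2-D weights, permuted axis is axis 0 (MLP layer, no bias).
w_a = rng.normal(size=(n, 6))
w_b = rng.normal(size=(n, 6))
A = cost_matrix(w_a, w_b, axis=0, n=n)
lhs = np.sum(P * A)            # <P, A>_F
rhs = np.sum(w_a * (P @ w_b))  # <W_A, P W_B>_F
assert np.isclose(lhs, rhs)

# Case 2: 4-D weights, permuted axis is the last one (assumed layout).
k_a = rng.normal(size=(3, 3, 5, n))
k_b = rng.normal(size=(3, 3, 5, n))
A4 = cost_matrix(k_a, k_b, axis=3, n=n)
lhs4 = np.sum(P * A4)
rhs4 = np.sum(k_a * np.take(k_b, perm, axis=3))
assert np.isclose(lhs4, rhs4)
```

In both cases entry `A[i, j]` is the dot product between slice `i` of the first tensor and slice `j` of the second, so summing the entries selected by `P` gives the same number as permuting the second tensor along that axis and taking the elementwise inner product with the first.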