Question about reproducing Fig.1
Hi! Thanks for the great work!
Been playing with the code today and trying to reproduce Figure 1 in the paper, and here's what I got.
I noticed that the activations after rotation have a larger norm, which does not match Figure 1.
I'm wondering whether this is expected or whether I missed something.
Thanks!
Here is my code.
Thanks @xinghaow99
I think the whole point is about scaling. You can try to scale the distribution by a single number (maybe divide the values by the maximum value).
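A minimal sketch of that scaling suggestion, not the plotting code used for the paper. The data is synthetic, the helper name `scale_by_max` is hypothetical, and a random orthonormal matrix from a QR decomposition stands in for the actual rotation. It shows that an overall constant factor in the rotation (which changes the norm) disappears once each tensor is divided by its own maximum absolute value.

```python
import numpy as np

def scale_by_max(x):
    """Divide by the largest absolute value so the result lies in [-1, 1]."""
    return x / np.abs(x).max()

rng = np.random.default_rng(0)

# Hypothetical activations: Gaussian values plus a few large outlier channels.
x = rng.standard_normal((128, 64))
x[:, :4] *= 50.0

# Random orthonormal matrix (assumption: a stand-in for the real rotation).
q, _ = np.linalg.qr(rng.standard_normal((64, 64)))

x_rot = x @ q              # orthonormal rotation: Frobenius norm is preserved
x_rot_big = x @ (8.0 * q)  # same rotation times a constant: norm is 8x larger

for name, a in [("original", x), ("rotated", x_rot), ("rotated x8", x_rot_big)]:
    s = scale_by_max(a)
    print(f"{name:>10}: norm={np.linalg.norm(a):10.2f}  "
          f"max|a|={np.abs(a).max():8.2f}  "
          f"rms after scaling={np.sqrt(np.mean(s ** 2)):.4f}")
```

After scaling, `rotated` and `rotated x8` give the same distribution, so the two can be compared with the original on a common axis regardless of the raw norm.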
Thank you for your response. @sashkboos
I get the idea now. Just to make sure: incoherent processing does affect the norm of the matrix, right? And did you apply any scaling when plotting Fig. 1?
@xinghaow99
Yes. We only care about the dynamic range during quantization.
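To illustrate why the overall scale does not matter here, below is a rough sketch assuming a symmetric per-tensor quantizer whose step size is set from the maximum absolute value. This is an assumption for illustration and not necessarily the quantizer used in the repository; `fake_quantize` and `relative_error` are hypothetical helper names.

```python
import numpy as np

def fake_quantize(x, bits=4):
    """Symmetric per-tensor quantize-dequantize with a max-abs step size."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def relative_error(x, bits=4):
    """Quantization error relative to the norm of the input."""
    return np.linalg.norm(x - fake_quantize(x, bits)) / np.linalg.norm(x)

rng = np.random.default_rng(0)
x = rng.standard_normal((128, 64))  # synthetic stand-in for activations

# Multiplying the tensor by a constant changes its norm but not the
# relative quantization error, because the step size scales with it.
print(relative_error(x), relative_error(16.0 * x))
```

Under this assumption, what determines the error is how the values are spread relative to their maximum, not the norm itself.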