ml-ane-transformers
flip order of weight+bias application in LayerNormANE
Hi, I'm trying to reproduce PyTorch's LayerNorm behavior. The formula PyTorch uses is (out * weight) + bias, which does not match the code in LayerNormANE.
So I changed it for my use case, and thought I'd open a PR in case this is in fact a bug.
However, looking at https://github.com/apple/ml-ane-transformers/commit/4b37184a506364b9c262413ab62610317f2d02c3, it looks like there are historical and/or legacy reasons for the order being this way, so feel free to reject if I'm missing something :)
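
To make the difference concrete, here is a minimal pure-Python sketch of the two orderings. It is not the repository's code: the helper names (`normalize`, `scale_then_shift`, `shift_then_scale`) and the sample values are made up for illustration, and I'm assuming the current LayerNormANE order is (out + bias) * weight, i.e. the reverse of what this PR proposes.

```python
import math

def normalize(x, eps=1e-5):
    """Zero-mean, unit-variance normalization of one vector (biased variance)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def scale_then_shift(x, weight, bias):
    # Order used by PyTorch's LayerNorm: (out * weight) + bias
    return [o * w + b for o, w, b in zip(normalize(x), weight, bias)]

def shift_then_scale(x, weight, bias):
    # Assumed current LayerNormANE order: (out + bias) * weight
    return [(o + b) * w for o, w, b in zip(normalize(x), weight, bias)]

# Hypothetical sample values, chosen only for illustration
x = [1.0, 2.0, 4.0, 8.0]
weight = [0.5, 1.5, 2.0, 3.0]
bias = [0.1, -0.2, 0.3, 0.4]

a = scale_then_shift(x, weight, bias)
b = shift_then_scale(x, weight, bias)

# The two orders differ by bias * (1 - weight) per element
max_diff = max(abs(p - q) for p, q in zip(a, b))

# Dividing the bias by the weight makes the orders agree (requires nonzero weights)
rescaled_bias = [bi / wi for bi, wi in zip(bias, weight)]
c = shift_then_scale(x, weight, rescaled_bias)
max_diff_rescaled = max(abs(p - q) for p, q in zip(a, c))
```

With the same weight and bias, the two orders disagree whenever weight != 1 and bias != 0. Algebraically, (out + bias / weight) * weight = (out * weight) + bias, so the existing order could still reproduce PyTorch's output if the bias were rescaled when loading weights. I don't know whether that is the reason behind the linked commit.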