Julian Minder
Hmm I would say it happened between and . My username is "resoundingcarpet" if that helps.
If you are on a system that has only NFS-mounted drives, you can fall back to the `/dev/shm` mount. It should be there on most Linux distros and is...
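For what it's worth, here is a rough sketch of how that fallback could be picked in Python. The helper name `pick_scratch_dir` and its parameters are made up for illustration and are not part of any library; it just returns the first writable directory among an optional preferred path, `/dev/shm`, and the system temp directory.

```python
import os
import tempfile
from pathlib import Path


def pick_scratch_dir(preferred=None, shm_path="/dev/shm"):
    """Return the first existing, writable directory among the candidates.

    Hypothetical helper: `preferred` and `shm_path` are illustrative
    parameters, not options of any particular tool.
    """
    candidates = [preferred, shm_path, tempfile.gettempdir()]
    for candidate in candidates:
        if candidate is None:
            continue
        path = Path(candidate)
        # Only accept directories that exist and that we can write to.
        if path.is_dir() and os.access(path, os.W_OK):
            return path
    raise RuntimeError("no writable scratch directory found")


if __name__ == "__main__":
    print(pick_scratch_dir())
```

Keep in mind that `/dev/shm` is usually memory-backed, so anything written there counts against RAM and does not survive a reboot.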
Hi, I've encountered a similar issue (I'm the collaborator of @Butanium). What I believe happens is that if you don't call `.value`, the whole computation graph is sometimes still kept...
I observe the same issue. It might be related to #570. There seems to be quite a large difference in logits between the two models. ```python from matplotlib import pyplot...
I investigated a bit further. The differences seem to stem from both attn and mlp (I haven't appended the plot, but the embedding matrix output is equal). ```python hf_model_nnsight = NNsight(hf_model) with...
@bryce13950 thanks, I already started digging a bit deeper here. It seems the weights are slightly different, even though an equality test doesn't show this. Would be great to have...
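To illustrate what I mean, here is a minimal sketch in plain Python (no torch). It assumes the equality test in question is tolerance-based, which may not match the actual check; `compare_weights` is a hypothetical helper. A tolerance-based comparison can pass while the values still differ, so reporting the maximum absolute difference next to it is more informative.

```python
import math


def compare_weights(a, b, rel_tol=1e-5, abs_tol=1e-8):
    """Compare two flat lists of floats in three ways.

    Hypothetical helper for illustration; the tolerances are arbitrary
    defaults chosen for this sketch.
    """
    if len(a) != len(b):
        raise ValueError("weight lists must have the same length")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return {
        # Strict element-wise equality.
        "exact_equal": all(x == y for x, y in zip(a, b)),
        # Tolerance-based equality, which can hide small differences.
        "close": all(
            math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
            for x, y in zip(a, b)
        ),
        # Largest element-wise deviation.
        "max_abs_diff": max(diffs, default=0.0),
    }


if __name__ == "__main__":
    reference = [0.1, 0.2, 0.3]
    perturbed = [0.1, 0.2 + 1e-9, 0.3]
    print(compare_weights(reference, perturbed))
```

With real model weights the same three checks would be run per parameter tensor, so that the layers with the largest deviation stand out.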