yoinked
yoinked
Maybe you could check which GPU is enabled, if that's even possible, to filter which cards should get it
> This is a fix I've seen floated around on threads for a while now and it's a curious one. Enabling cuDNN shouldn't have any effect as cuDNN is enabled...
this *might* take more time on startup, since it loops over every card, loops over a list of Turing cards, and checks the name; but it's better for the...
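a rough sketch of the name check described above, not the actual patch: `AFFECTED_NAME_FRAGMENTS` and the helper names are hypothetical, and the real list of cards may differ. it assumes device names come from `torch.cuda.get_device_name`, and falls back to an empty list if torch or CUDA is unavailable.

```python
from typing import Iterable, List

# Hypothetical name fragments for illustration only; the real list of
# affected cards in the change under discussion may be different.
AFFECTED_NAME_FRAGMENTS = ["GTX 16"]


def is_affected(device_name: str,
                fragments: Iterable[str] = AFFECTED_NAME_FRAGMENTS) -> bool:
    """Return True if the device name contains any listed fragment."""
    name = device_name.lower()
    return any(fragment.lower() in name for fragment in fragments)


def any_affected(device_names: Iterable[str]) -> bool:
    """Return True if at least one visible card matches the list."""
    return any(is_affected(name) for name in device_names)


def visible_cuda_device_names() -> List[str]:
    """Collect the names of all visible CUDA devices (empty if none)."""
    try:
        import torch  # imported lazily so the pure helpers work without it
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())]


if __name__ == "__main__":
    names = visible_cuda_device_names()
    print("visible devices:", names)
    print("apply workaround:", any_affected(names))
```

the cost is one pass over the visible cards times the length of the list, so the startup overhead should be small unless the list grows a lot.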
> Why are the 20xx cards in the list though? They work fine now, and judging by other replies this change would just tank performance for no reason. some 20xx...
p.s. if google offered any bigger TPUs for TRC, i could train retnet-3b (the point at which retnet is better than regular transformers), but as of now, there's retnet_base (small)...
as far as i know, no official pretrained models were released by microsoft, but the training code is in the torchscale repo, so that's how i am training the models
ah, understood, i'll try to get a good checkpoint; but for now, i assume i can close this and reopen when it finishes training
oops
https://huggingface.co/parsee-mizuhashi/retnet/tree/main trained it for 1M steps, loss is around `4.2`; hope this is good enough for some inference code
> My recommendation would be to put the model on the hub following [this tutorial](https://huggingface.co/docs/transformers/custom_models), which will help having a working code without going through the hassle of all the...