Alex Shaw

2 issues by Alex Shaw

Has anyone been able to fine-tune any of the models larger than 7B successfully? I'm training on 8 A100s with 80GB of RAM each, which is more than enough space....

It is not obvious what the "official" codebase of a paper is. Can we give a more formal definition? In our `tb tasks debug` check, agents often find valid repositories...