Alex Shaw
Results: 2 issues of Alex Shaw
Has anyone been able to fine-tune any of the models larger than 7B successfully? I'm training on 8 A100s with 80GB of RAM each, which is more than enough space....
It is not obvious what the "official" codebase of a paper is. Can we give a more formal definition? In our `tb tasks debug` check, agents often find valid repositories...