Jack Shi Wei Lun
Thanks for your response. However, where do I find the vocab file in that Hugging Face repo? I assume you meant the vocab.json file?
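For anyone else landing here, a minimal sketch of pulling just the tokenizer vocab from the Hub with huggingface-cli (the repo id below is only a placeholder; note that some repos ship tokenizer.json instead of vocab.json, in which case that is the file to fetch):
```
# Download only the vocab file from a model repo on the Hub.
# "your-org/your-model" is a placeholder for the actual repo id.
# The command prints the local path of the cached file.
huggingface-cli download your-org/your-model vocab.json
```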
The files are not broken. This is an issue for other people as well. In fact, you don't have to quantize a custom deepseek model to get this error. If...
Hi all, I am not investigating this issue anymore. I am using another model. Hope someone can fix this / look into it. @cmp-nct
@abhishekkrthakur, multi-GPU training with autotrain-advanced doesn't seem to work. Could you please advise? Setting this in my job script: ``` export CUDA_VISIBLE_DEVICES=0,1 ``` does not seem to split the...
Any updates on this? Running on multiple GPUs doesn't seem to be supported despite the documentation stating that it is; it just loads the same task onto each GPU.
Could you kindly let me know how exactly I do that? This is not written anywhere in the documentation for autotrain-advanced, and it is unclear how I should execute...
That does not work. @mrticker, please confirm too.
I assume you simply use ``` export CUDA_VISIBLE_DEVICES=0,1,2,3,4.... ``` at the start of your job script? Is that all that's required? Otherwise, could you kindly let me know what exactly did...
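Not an authoritative answer, but since autotrain-advanced runs on top of accelerate (as far as I can tell), a job script along these lines is one way to actually distribute the run across GPUs rather than just exposing them; the script name and process count below are placeholders:
```
#!/bin/bash
# Expose the GPUs you want to use (placeholder indices).
export CUDA_VISIBLE_DEVICES=0,1

# Launching through accelerate spawns one process per GPU so the work is
# actually distributed, instead of the whole job landing on a single device.
# "train.py" and the process count are placeholders for the real entry point.
accelerate launch --multi_gpu --num_processes 2 train.py
```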
@abhishekkrthakur, if an 80GB A100 works without OOM, does it mean that 2x 40GB A100s will work as well? Because currently, an 80GB A100 works for me, but 2x 40GB A100...
@abhishekkrthakur, are you able to give insight on this? It seems like a major bug. Does it mean that increasing the model_max_length (or block_size) while keeping the data length the same will affect the...