Jack Shi Wei Lun
Thanks for your response. However, where do I find the vocab file in that Hugging Face repo? I assume you meant the vocab.json file?
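For anyone else landing here, a minimal sketch of pulling just the tokenizer vocab from the Hub with huggingface-cli (the repo id below is only a placeholder; note that some repos ship tokenizer.json instead of vocab.json, in which case that is the file to fetch):
```
# Download only the vocab file from a model repo on the Hub.
# "your-org/your-model" is a placeholder for the actual repo id.
# The command prints the local path of the cached file.
huggingface-cli download your-org/your-model vocab.json
```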
The files are not broken. This is an issue for other people as well. In fact, you don't have to quantize a custom deepseek model to get this error. If...
Hi all, I am not investigating this issue anymore. I am using another model. Hope someone can fix this / look into it. @cmp-nct
@abhishekkrthakur, multi-GPU training with autotrain-advanced doesn't seem to work. Could you please advise? Setting this in my job script: ``` export CUDA_VISIBLE_DEVICES=0,1 ``` does not seem to split the...
Any updates on this? Running on multiple GPUs doesn't seem to be supported despite the documentation stating that it is; it just loads the same task onto each GPU.
Could you kindly let me know how exactly I do that? This is not written anywhere in the documentation for autotrain-advanced, and it is unclear how I should execute...
That does not work. @mrticker, please confirm too.
I assume you simply use ``` export CUDA_VISIBLE_DEVICES=0,1,2,3,4.... ``` at the start of your job script? Is that all that's required? Otherwise, could you kindly let me know what exactly did...
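Not an authoritative answer, but since autotrain-advanced runs on top of accelerate (as far as I can tell), a job script along these lines is one way to actually distribute the run across GPUs rather than just exposing them; the script name and process count below are placeholders:
```
#!/bin/bash
# Expose the GPUs you want to use (placeholder indices).
export CUDA_VISIBLE_DEVICES=0,1

# Launching through accelerate spawns one process per GPU so the work is
# actually distributed, instead of the whole job landing on a single device.
# "train.py" and the process count are placeholders for the real entry point.
accelerate launch --multi_gpu --num_processes 2 train.py
```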
@abhishekkrthakur, if an 80GB A100 works without OOM, does it mean that 2x 40GB A100s will work as well? Because currently, an 80GB A100 works for me, but 2x 40GB A100...
@abhishekkrthakur, are you able to give insight on this? It seems like a major bug. Does it mean that increasing the model_max_length (or block_size) while keeping the data length the same will affect the...