starcoder

What is the recommended GPU configuration to chat fine-tune at the full sequence length of 8192?

Open mathav95raj opened this issue 2 years ago • 2 comments

Even with an NVIDIA A100 80 GB GPU, I am not able to fine-tune the model at the full sequence length of 8192. I was not able to fine-tune the full-precision model with this configuration. With the 8-bit quantized model and LoRA applied, I could only go up to a sequence length of about 6000 (possibly a little higher, but the full 8192 gives a CUDA OOM error).
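For anyone trying to reason about why this runs out of memory, here is a rough back-of-envelope sketch rather than a measurement. The model figures (about 15.5B parameters, 40 layers, 48 attention heads) are assumptions about StarCoder, not values taken from this thread, and the helper names are hypothetical. It only counts weights, optimizer state, and the attention score matrices; other activations, LoRA parameters, and framework overhead are ignored, and a memory-efficient attention kernel that never materializes the score matrix would change the picture.

```python
# Back-of-envelope GPU memory estimate. All model figures are ASSUMPTIONS
# about StarCoder, not values confirmed in this thread.
N_PARAMS = 15.5e9  # assumed parameter count
N_LAYERS = 40      # assumed number of transformer layers
N_HEADS = 48       # assumed number of attention heads
GIB = 1024 ** 3


def weight_gib(n_params, bytes_per_param):
    """Memory for the weights alone (4 bytes = fp32, 1 byte = 8-bit)."""
    return n_params * bytes_per_param / GIB


def full_finetune_gib(n_params):
    """fp32 weights (4) + fp32 gradients (4) + two Adam moments (8) = 16 bytes/param."""
    return n_params * 16 / GIB


def attn_scores_gib(seq_len, n_heads=N_HEADS, bytes_per_elem=2, batch=1):
    """One layer's seq_len x seq_len score matrix per head, if fully materialized."""
    return batch * n_heads * seq_len ** 2 * bytes_per_elem / GIB


for seq_len in (2048, 6000, 8192):
    per_layer = attn_scores_gib(seq_len)
    print(f"seq_len={seq_len}: {per_layer:.2f} GiB of scores per layer, "
          f"{per_layer * N_LAYERS:.1f} GiB if kept for all {N_LAYERS} layers")

print(f"8-bit weights: {weight_gib(N_PARAMS, 1):.1f} GiB")
print(f"full fp32 fine-tune with Adam: {full_finetune_gib(N_PARAMS):.1f} GiB")
```

Under these assumptions, full-precision fine-tuning with Adam needs far more than 80 GB before any activations are counted, and the score matrices grow with the square of the sequence length, so going from 6000 to 8192 tokens costs roughly 1.9x more for that term. Gradient checkpointing or a memory-efficient attention implementation may be what closes the gap, but I have not verified that on this setup.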

mathav95raj, Jul 17 '23 13:07

When the sequence length is smaller than 8192, are you using a single A100 80 GB GPU to fine-tune the model, and does that work?

bigmancomeon, Oct 08 '23 10:10

Yes, it works.

mathavraj, Oct 20 '23 11:10