OOM when getting the state dict

Open WallE-Chang opened this issue 2 years ago • 1 comments

I am training LLaMA 13B on 8 RTX 3090 GPUs with LoRA. The forward and backward passes run fine, but when the model's state dict is retrieved, the GPU runs out of memory (OOM).

May 25 '23 08:05 WallE-Chang

It seems the trainer is saving the entire model rather than just the LoRA modules. I suspect that saving only the LoRA modules will resolve this.
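For illustration, here is a minimal sketch of collecting only the LoRA weights in plain PyTorch. It assumes the adapter parameters have `lora_` in their names (as in loralib, which also ships a `lora_state_dict` helper for this purpose). `TinyLoRALinear` and `lora_only_state_dict` are hypothetical names made up for this example, not part of any trainer, and the sketch does not cover parameters sharded by FSDP or DeepSpeed.

```python
import torch
import torch.nn as nn


class TinyLoRALinear(nn.Module):
    """Toy stand-in for a LoRA-wrapped layer (hypothetical, for illustration only)."""

    def __init__(self, in_features, out_features, r=4):
        super().__init__()
        # Frozen base weight: this is the large tensor we do NOT want to save.
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features), requires_grad=False
        )
        # Small trainable low-rank factors.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))

    def forward(self, x):
        return x @ self.weight.T + x @ self.lora_A.T @ self.lora_B.T


def lora_only_state_dict(model):
    """Copy only parameters whose name contains 'lora_' to CPU.

    Iterating over named_parameters() avoids materializing a state dict
    for the full model.
    """
    return {
        name: param.detach().cpu().clone()
        for name, param in model.named_parameters()
        if "lora_" in name  # assumption: adapters follow this naming scheme
    }
```

The result can then be written with `torch.save(lora_only_state_dict(model), "lora_adapter.pt")`, where the filename is just a placeholder. Only the low-rank factors are copied, so the saved file and the memory needed to build it stay small compared with the full 13B model.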

Aug 05 '23 17:08 edwardjhu