mackamann

10 comments by mackamann

@btrude could you possibly share the changes you made to get 5b_lyrics working, like removing the dependence on ddp and deleting the optimizer state? We're at pytorch 1.10, not sure...
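A minimal sketch of the "deleting the optimizer state" part, assuming the checkpoint is a plain dict with the optimizer stored under a key like `"opt"` (key names are hypothetical; inspect your own checkpoint with `torch.load(path, map_location="cpu")` to see what it actually contains):

```python
def strip_optimizer_state(checkpoint, opt_keys=("opt", "optimizer")):
    """Return a copy of a checkpoint dict without optimizer state.

    opt_keys is a guess at the key names; check the real checkpoint
    before relying on them. In practice you'd load with
    torch.load(path, map_location="cpu"), strip, then torch.save().
    """
    return {k: v for k, v in checkpoint.items() if k not in opt_keys}


# toy dict standing in for a real torch checkpoint
ckpt = {"model": {"w": [1.0, 2.0]}, "opt": {"step": 12000}, "step": 12000}
slim = strip_optimizer_state(ckpt)
```

Dropping the optimizer state can shrink the file substantially (Adam keeps two extra tensors per parameter), at the cost of not being able to resume training seamlessly from that file.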

@btrude thanks for the info, much appreciated! I haven't had much luck signing up with cloud compute providers, they seem to be very picky about who they'll let in, esp...

thanks for that! I took a shot at it earlier and am still struggling, I'll try your approach, this is what I have (unfinished) with copious amounts of help from chatgpt...


Thanks @btrude. I'm fortunate that a cloud provider finally accepted me... after a lot of setting things up and writing scripts I got the model to start training and it...

Hehe, I did that and overdid it a bit and ended up over 80G. I also tried yanking out DDP thinking that might be the culprit (that was a losing...
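One common snag when yanking out DDP: weights saved from a `DistributedDataParallel`-wrapped model have a `module.` prefix on every key, so a bare model refuses to load them. A small sketch of stripping that prefix (the prefix name is standard PyTorch DDP behavior; the surrounding usage is illustrative):

```python
def strip_ddp_prefix(state_dict, prefix="module."):
    """Rename state_dict keys saved from a DDP-wrapped model so a
    bare (non-DDP) model can load them with load_state_dict()."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }


# toy state_dict as it would look after saving a DDP-wrapped model
sd = {"module.encoder.weight": 1, "module.decoder.bias": 2}
clean = strip_ddp_prefix(sd)  # keys become "encoder.weight", "decoder.bias"
```

The reverse (adding the prefix back) works the same way if you ever need to resume under DDP again.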

I don't think you are off, it seems way too high... I will try again removing DDP, I got pretty far, but then out of the blue the training started...

very true, and thanks for all your help! if you have any other wisdom to share, I'm all ears... you seem to have chewed on the codebase a LOT. :)

Bah, the auto-sample @ 12k iterations caused memory to spike and it OOM'ed when it tried to resume. :( Gonna have to disable that.
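The simplest fix is to gate the memory-heavy sampling pass behind a switch in the training loop. A sketch with hypothetical flag names (the actual hyperparameter names in the codebase will differ):

```python
def maybe_sample(step, sample_every=12000, enable_sampling=False):
    """Decide whether to run the periodic audio sample this step.

    sample_every and enable_sampling are hypothetical knobs; the point
    is just to gate sampling behind a flag so a memory spike during
    sampling can't OOM the training run.
    """
    return enable_sampling and step > 0 and step % sample_every == 0


# with the flag off (the default here), sampling never fires
should_sample = maybe_sample(12000)
```

With the flag off, training never enters the sampling path; you can still sample offline from saved checkpoints in a separate process.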

@btrude when you were finetuning 5b/5b_lyrics, did you also finetune the upsamplers and then do annealing?

makes sense... I'm sooooooo close to this working, I've trained 15k steps, ported over the finetuning sampling notebook, was able to sample, and the resulting .wavs were all the same, mostly...