Did either of you work this out? I can't get past the above error (and I can't really figure out why it's being thrown anyway).
I seem to be having H100-specific problems as well. Is this still potentially an incompatibility? In my case I'm frequently timing out with a thread deadlock when running under Slurm. Just...
> You can have a look at this to run the jobs manually without Slurm: https://github.com/facebookresearch/dora#multi-node-training-without-slurm
>
> You won't be able to use the `dora grid` command, although...
@burstMembrane, did you find a good solution for batch processing? I have 8 GPUs and want to extract a bunch of embeddings as quickly as possible. I noticed the "batch_size"...
Thanks, yes, I actually realized I could do something similar by just chunking my data into as many chunks as I have GPUs (8) and running a separate serial process for each...
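In case it's useful to anyone else, here is roughly what that looks like as a sketch; `extract_embeddings` is a hypothetical stand-in for whatever per-file embedding code you already run on a single GPU:

```
# Sketch: shard a file list across GPUs and run one worker process per device.
import torch
import torch.multiprocessing as mp
from pathlib import Path


def worker(rank, shards):
    device = torch.device(f"cuda:{rank}")
    for path in shards[rank]:
        emb = extract_embeddings(path, device=device)  # hypothetical: your existing single-GPU code
        torch.save(emb.cpu(), Path(path).with_suffix(".pt"))


def main():
    files = sorted(str(p) for p in Path("audio").glob("*.wav"))
    n_gpus = torch.cuda.device_count()
    shards = [files[i::n_gpus] for i in range(n_gpus)]  # round-robin split, one shard per GPU
    mp.spawn(worker, args=(shards,), nprocs=n_gpus, join=True)


if __name__ == "__main__":
    main()
```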
I'm having a similar problem (with `dora launch` on Slurm, in my case). Will disabling it actually allow training to progress? I mean, if there's an actual deadlock somewhere, isn't it...
Another note on my situation is that the GPU utilization shoots straight up to 100% on all GPUs (2 nodes, 8xH100 each).
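For anyone else hitting this: before concluding it's a real deadlock, it helps to turn on the standard PyTorch/NCCL debug logging so the hanging collective at least shows up in the logs. A minimal sketch (nothing Audiocraft-specific, just environment variables set before the process group is created):

```
# Set these before torch.distributed creates a process group, e.g. at the very
# top of the training entry point (or export them in the Slurm submission script).
import os

os.environ.setdefault("NCCL_DEBUG", "INFO")                 # verbose NCCL logging per rank
os.environ.setdefault("TORCH_DISTRIBUTED_DEBUG", "DETAIL")  # report mismatched collectives
os.environ.setdefault("NCCL_ASYNC_ERROR_HANDLING", "1")     # fail fast instead of hanging forever
```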
Just adding a bit more info, I managed to at least get to an _attempt_ to load `larger_clap_music` using this config:

```
conditioners:
  description:
    model: clap
    clap:
      # based on...
```
Okay, I can load `larger_clap_music` using the ClapModel (and ClapProcessor) from Hugging Face, but not in Audiocraft. I see that Audiocraft is based on CLAP from the LAION repo... Does anybody...
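For reference, this is roughly the Hugging Face side that works for me (a sketch against `transformers`, assuming the `laion/larger_clap_music` checkpoint id on the Hub):

```
# Loading larger_clap_music via transformers works fine; Audiocraft's conditioner
# instead expects a checkpoint in the LAION CLAP repo's format.
import torch
from transformers import ClapModel, ClapProcessor

model = ClapModel.from_pretrained("laion/larger_clap_music")
processor = ClapProcessor.from_pretrained("laion/larger_clap_music")

inputs = processor(text=["ambient piano with soft pads"], return_tensors="pt", padding=True)
with torch.no_grad():
    text_emb = model.get_text_features(**inputs)  # (batch, projection_dim) text embedding
print(text_emb.shape)
```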
I worked out a workaround for loading the HF weights. Now what I'm wondering about is how to configure a text prompt for running test generations during training. My goal...