JaejinCho
Thank you @gpucce for your answer! :) I think I did not explain the batch size difference well, so please disregard it for now. I may need to do...
> There is also a weird issue of increase in vocab size depending on how we add the pad token.
>
> Method 1:
>
> `from transformers import LlamaTokenizer,...`
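The quoted snippet is cut off, but the vocab-size difference it describes usually comes down to whether the pad token is registered as a brand-new entry or aliased to an existing special token. A minimal sketch of the two approaches (the checkpoint path and token choices are illustrative assumptions, not the original poster's exact code):

```python
from transformers import LlamaTokenizer

# "path/to/llama_tokenizer" is a placeholder checkpoint path.
tok1 = LlamaTokenizer.from_pretrained("path/to/llama_tokenizer")
print(len(tok1))  # original vocab size

# Approach A: register a brand-new [PAD] token.
# The vocab grows by 1, so the model's embeddings must be resized to match,
# e.g. model.resize_token_embeddings(len(tok1)).
tok1.add_special_tokens({"pad_token": "[PAD]"})
print(len(tok1))  # original vocab size + 1

# Approach B: reuse an existing special token (here EOS) as the pad token.
# The vocab size is unchanged and no embedding resize is needed.
tok2 = LlamaTokenizer.from_pretrained("path/to/llama_tokenizer")
tok2.pad_token = tok2.eos_token
print(len(tok2))  # unchanged
```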
https://github.com/enhuiz/vall-e/issues/22
> `python -m torch.distributed.launch --nproc_per_node 2 -m vall_e.train yaml=config/test/ar.yml`
>
> This worked for me

@eschmidbauer How does this command work if you want to use multiple nodes w/ more...
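For reference, a minimal sketch of how such a launch is typically extended across machines (the node count, master address, and port below are placeholder assumptions, not from this thread): the same command is run on every node, adding `--nnodes`, `--node_rank`, `--master_addr`, and `--master_port`, with only `--node_rank` changing per node.

```bash
# On node 0 (assumed master at 10.0.0.1:29500; 2 nodes x 2 GPUs each)
python -m torch.distributed.launch --nproc_per_node 2 --nnodes 2 --node_rank 0 \
    --master_addr 10.0.0.1 --master_port 29500 \
    -m vall_e.train yaml=config/test/ar.yml

# On node 1: identical command, only --node_rank differs
python -m torch.distributed.launch --nproc_per_node 2 --nnodes 2 --node_rank 1 \
    --master_addr 10.0.0.1 --master_port 29500 \
    -m vall_e.train yaml=config/test/ar.yml
```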
> > How long would the training take, for example in the 16x V100 case?
>
> It takes 4 days with 16x 32G V100. I did not train it...
Thanks for the interesting work! I am also interested in getting the data. It would be great if you could share how to request it :).