Marcelo Matheus Gauy
Thank you for your answer. I inspected my commands and realized that the pretrained weights file was zipped. In order to get it to...
To further explain how your model will be used: first we will do additional pretraining on Brazilian Portuguese speech data and test the new and perhaps improved model on standard...
Thanks for your comments and help. From what I understand, I have four options: 1) Pre-training from scratch with M2D on Brazilian Portuguese Speech and later fine-tuning on the specific...
A small addendum for others: to set up distributed mode, I had to adapt the command line to CUDA_VISIBLE_DEVICES=0,1 python3 -m torch.distributed.launch --nproc_per_node=2 train_audio.py ... Without adding torch.distributed.launch, the model...
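For anyone wondering why the launcher matters: torch.distributed.launch starts one process per GPU and exports RANK, WORLD_SIZE and LOCAL_RANK into each process's environment. Many training scripts only switch to distributed mode when they see those variables. The sketch below illustrates that kind of check; the function name detect_distributed is hypothetical, and I am assuming (not confirming) that train_audio.py does something similar.

```python
import os


def detect_distributed(env=None):
    """Return (rank, world_size, local_rank) if launcher variables are present, else None.

    Hypothetical helper for illustration; not taken from the M2D code base.
    """
    env = os.environ if env is None else env
    # torch.distributed.launch exports these for every worker process it spawns.
    if "RANK" in env and "WORLD_SIZE" in env:
        rank = int(env["RANK"])
        world_size = int(env["WORLD_SIZE"])
        # LOCAL_RANK is the GPU index on the current node; fall back to 0 if absent.
        local_rank = int(env.get("LOCAL_RANK", 0))
        return rank, world_size, local_rank
    # Started with plain "python3 train_audio.py": no launcher variables are set.
    return None


if __name__ == "__main__":
    info = detect_distributed()
    if info is None:
        print("Launcher variables not found: running as a single process.")
    else:
        print("rank=%d world_size=%d local_rank=%d" % info)
```

Running this file directly with python3 takes the single-process branch, while running it through the launcher command above should take the other branch once per GPU.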
Thanks for the answer. I will be analyzing/testing how to schedule the learning rate for option 4 then. That was a very helpful comment that would have taken me time...
By the way, one last question: do you have an intuition on what values to expect for the loss during pre-training? While the primary measure of performance will be the...
Thank you for your answer. So the losses I see seem to be more or less in line with what you observed, though it might be possible to do better...
Thank you for the logs. They will be helpful. Unfortunately, I have only exchanged words with other researchers who reported this issue. I have not found a paper documenting this...