'synth_set' used twice
Hi, I was looking through the code for the DCASE'24 Task 4 baseline system and noticed the following lines in the file train_pretrained.py:
strong_full_set = torch.utils.data.ConcatDataset([strong_set, synth_set])
tot_train_data = [maestro_real_train, synth_set, strong_full_set, weak_set, unlabeled_set]
train_dataset = torch.utils.data.ConcatDataset(tot_train_data)
According to this, 'synth_set' is used twice. Is there a specific reason for this?
Hi,
Thanks for the question. I think it was done only to "upsample" the amount of synthetic training data seen in each epoch. It is very similar to the past recipe, which used 12 for the synthetic training data, except that the 12 has been split into 6 (synthetic alone) and 6 (synthetic + strong).
In general the recipe is very sensitive to the batch size and to the proportions of each dataset. This is certainly not optimal, but it worked well in our experiments.
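To illustrate the upsampling effect, here is a minimal sketch using plain Python lists as stand-ins for the actual Dataset objects (the names mirror train_pretrained.py, but the contents and sizes are made up): because synth_set appears both on its own and inside strong_full_set, every synthetic clip is visited twice per epoch.

```python
# Toy stand-ins for the real datasets (hypothetical contents).
synth_set = ["synth_0", "synth_1"]
strong_set = ["strong_0"]
weak_set = ["weak_0"]
unlabeled_set = ["unlab_0"]

# Mirrors the ConcatDataset calls in train_pretrained.py:
# synth_set is folded into strong_full_set AND listed again on its own.
strong_full_set = strong_set + synth_set
tot_train_data = synth_set + strong_full_set + weak_set + unlabeled_set

# Each synthetic clip now occurs twice in one pass over the data.
print(tot_train_data.count("synth_0"))  # -> 2
```

The same counting argument holds for torch.utils.data.ConcatDataset, since it simply chains the underlying datasets without deduplication.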
@JanekEbb, do you maybe know more?
Thanks for the explanation!
Actually, I'd say that leads to strong_set (the strong AudioSet portion) being underrepresented in training. Currently strong_set makes up only 6/64 * 3470/(10000+3470) ≈ 2.4% of the training data, if I am not mistaken. We may want to fix that.
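Working the numbers from that expression (assuming 6/64 is the per-batch share drawn from the strong+synthetic concatenation, and 3470 vs. 10000 are the strong and synthetic clip counts), the fraction comes out to roughly 2.4%:

```python
# Fraction of training samples coming from strong_set, under the
# assumptions stated in the thread (numbers are from the discussion,
# not verified against the actual dataset sizes).
batch_share = 6 / 64                   # strong_full_set's share of each batch
strong_within = 3470 / (10000 + 3470)  # strong clips within strong_full_set
frac = batch_share * strong_within

print(f"{frac:.1%}")  # -> 2.4%
```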
Thanks for pointing that out Florian!
After many tries, it seems to me that the best configuration is this one, with the strong and synthetic sets concatenated. The strong labels do not seem to help in my case.