Does encoding support multi-task losses?
@zhanghang1989, Hi, thanks for your nice work. Does encoding support multi-task losses? When using a single GPU, my model returns a dictionary in which each key/value pair corresponds to the loss of a specific task.
I tried encoding.parallel.DataParallelModel, and the return value is a list of dictionaries whose length equals the number of devices. I then apply torch.nn.parallel._functions.Gather to that output list to combine the losses from the different devices onto a single GPU (roughly like the sketch below). However, the experimental results are not as good as torch.nn.DataParallel with the same batch size.
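A rough sketch of the gathering step (the helper name gather_loss_dicts is made up for illustration, and I assume each dictionary value is a scalar loss tensor):

```python
import torch
from torch.nn.parallel._functions import Gather

def gather_loss_dicts(loss_dicts, target_device=0, dim=0):
    # loss_dicts: one dict per GPU, e.g. [{'task_a': t, 'task_b': t}, ...]
    gathered = {}
    for key in loss_dicts[0]:
        per_device = [d[key] for d in loss_dicts]
        # Unsqueeze 0-dim losses so Gather can concatenate them along `dim`.
        per_device = [t.unsqueeze(0) if t.dim() == 0 else t for t in per_device]
        # Concatenate onto target_device, then average over the replicas.
        gathered[key] = Gather.apply(target_device, dim, *per_device).mean()
    return gathered
```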
So is there something I missed?
In addition, I wasn't able to use encoding.parallel.DataParallelCriterion successfully; it fails with this error: 'function' object has no attribute 'children'.
If you do not use DataParallelCriterion, you should use the PyTorch built-in DataParallel.
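For reference, a minimal self-contained sketch of the built-in route (the two-head model and loss function below are made up for illustration): with nn.DataParallel the outputs from all GPUs are gathered back onto the default device, so a plain loss function can be applied directly.

```python
import torch
import torch.nn as nn

class TwoHeadModel(nn.Module):
    """Toy multi-task model: a shared backbone with two task heads."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 8)
        self.head_a = nn.Linear(8, 4)
        self.head_b = nn.Linear(8, 2)

    def forward(self, x):
        feat = self.backbone(x)
        return self.head_a(feat), self.head_b(feat)

def multi_task_loss(outputs, target_a, target_b):
    # Plain function returning a dict of per-task losses.
    out_a, out_b = outputs
    return {
        'task_a': nn.functional.cross_entropy(out_a, target_a),
        'task_b': nn.functional.cross_entropy(out_b, target_b),
    }

model = nn.DataParallel(TwoHeadModel().cuda())
x = torch.randn(32, 16).cuda()
ta = torch.randint(0, 4, (32,)).cuda()
tb = torch.randint(0, 2, (32,)).cuda()

losses = multi_task_loss(model(x), ta, tb)  # outputs already gathered on GPU 0
total = sum(losses.values())
total.backward()
```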
Both DataParallelCriterion and the PyTorch built-in DataParallel require an nn.Module; however, the loss computation in my case is a plain function rather than an nn.Module. So I just use torch.nn.parallel._functions.Gather with the built-in DataParallel, but the performance of the trained model is only on par with the one trained without SyncBN.
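One way around the 'function' object has no attribute 'children' error might be to wrap the loss function in a small nn.Module so that DataParallelCriterion can treat it like any other criterion. A minimal sketch, assuming the loss function is called multi_task_loss (hypothetical name) and returns a dict of per-task losses:

```python
import torch.nn as nn

class MultiTaskCriterion(nn.Module):
    """Hypothetical wrapper: turns a plain loss function into an nn.Module
    so module-based wrappers such as DataParallelCriterion can handle it."""

    def __init__(self, loss_fn):
        super().__init__()
        self.loss_fn = loss_fn  # e.g. multi_task_loss(outputs, targets) -> dict

    def forward(self, outputs, targets):
        losses = self.loss_fn(outputs, targets)
        # Sum the per-task losses into a single scalar for backward().
        return sum(losses.values())

# Usage (sketch):
# criterion = encoding.parallel.DataParallelCriterion(MultiTaskCriterion(multi_task_loss))
```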