DingHing
In the BYOL method, does the loss curve always decrease, or does it decrease first and then increase? As I understand it, initially the target_network and...
In DeepLabV3+, the backbone can only be Xception; other backbones cannot be selected.
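In the code as implemented, is each GPU's queue updated independently during multi-GPU parallel training? I feel it should work as in MoCo: first call keys = concat_all_gather(keys), then update the queue once with the gathered keys. Wouldn't that be more reasonable?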
I am raising this GitHub issue to ask when the LangMem section will be added.