Results 4 issues of cw

In the decoder position embedding matrix, the size of the first dimension is the number of patches + 1, where the extra 1 accounts for ViT's cls_token. But when embedding the position for...
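To make the shape described above concrete, here is a minimal sketch in plain Python. The image size, patch size, and embedding width are hypothetical values chosen for illustration, and placing the cls_token at index 0 is an assumption based on the usual ViT convention, not something confirmed by the issue text.

```python
# Hypothetical sizes, chosen only to illustrate the shape arithmetic.
image_size = 224
patch_size = 16
embed_dim = 8

# Number of non-overlapping square patches in the image.
num_patches = (image_size // patch_size) ** 2

# Position table with one extra row for the cls_token.
pos_embed = [[0.0] * embed_dim for _ in range(num_patches + 1)]

# Assumption: row 0 belongs to the cls_token, the remaining rows to patches.
cls_pos = pos_embed[0]
patch_pos = pos_embed[1:]

print(len(pos_embed), len(patch_pos))
```

Under these assumptions, any code that indexes the table by patch position has to account for the one-row offset introduced by the cls_token.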

My server has multiple GPUs. When I change '.cuda()' to '.cuda(1)', an exception occurs: cupy.cuda.driver.CUDADriverError: CUDA_ERROR_INVALID_HANDLE: invalid resource handle. I also added 'os.environ["CUDA_VISIBLE_DEVICES"] = 1', but it doesn't work. Can anyone...
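One detail in the snippet above can be checked without a GPU: values in os.environ must be strings, so assigning the integer 1 raises a TypeError. The sketch below shows the string form. It assumes the variable is set before any CUDA-using library is imported, and it makes no claim that this resolves the cupy error reported in the issue.

```python
import os

# Environment values must be strings; assigning an int raises TypeError.
try:
    os.environ["CUDA_VISIBLE_DEVICES"] = 1  # the form quoted in the question
    int_assignment_error = None
except TypeError as exc:
    int_assignment_error = exc

# String form. Assumption: this runs before importing any library
# that initializes CUDA, otherwise the setting may be ignored.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
visible = os.environ["CUDA_VISIBLE_DEVICES"]

print(type(int_assignment_error).__name__, visible)
```

With only physical GPU 1 visible, the process sees it as device 0, so the code would then call '.cuda()' or '.cuda(0)' rather than '.cuda(1)'.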

I am confused about why BCE is used to compute the loss for 'x' and 'y'. Could MSE be used instead? https://github.com/BobLiu20/YOLOv3_PyTorch/blob/c6b483743598b5f64d520d81e7e5f47ba936d4c9/nets/yolo_loss.py#L55
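As a small numeric illustration of the question, the sketch below compares the two losses for a single prediction. It assumes the 'x' and 'y' predictions are sigmoid outputs in (0, 1) and the targets are fractional offsets in [0, 1]; the target value 0.3 is hypothetical. It does not say which loss the repository author intended.

```python
import math

def bce(p, t):
    """Binary cross-entropy for one prediction p in (0, 1) and soft target t."""
    return -(t * math.log(p) + (1 - t) * math.log(1 - p))

def mse(p, t):
    """Squared error for one prediction."""
    return (p - t) ** 2

target = 0.3  # hypothetical fractional offset inside a grid cell

# Scan candidate predictions strictly inside (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
best_bce = min(grid, key=lambda p: bce(p, target))
best_mse = min(grid, key=lambda p: mse(p, target))

print(best_bce, best_mse)
```

Both losses are minimized when the prediction equals the target, so either can be used for regression onto values in [0, 1]. They differ in their gradients: with respect to the pre-sigmoid logit, BCE gives p - t, while MSE gives 2(p - t)p(1 - p), which shrinks as p approaches 0 or 1.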

I appreciate this excellent work! But I'm confused about the training target and loss function. According to the paper, the training target is to recover the masked area of the source image with...