Expected tensor for argument #1 'input' to have the same type as tensor for argument #2 'rois'; but type torch.cuda.HalfTensor does not equal torch.cuda.FloatTensor
I am getting this error while training a Faster R-CNN. The training loop:
for epoch in range(num_epochs):
    model.train()
    i = 0
    for imgs, annotations in data_loader:
        i += 1
        total_processed += 1
        # Move the images and the target dicts to the training device.
        imgs = list(img.to(device) for img in imgs)
        annotations = [{k: v.to(device) for k, v in t.items()} for t in annotations]
        # In train mode the detection model returns a dict of losses.
        loss_dict = model(imgs, annotations)
        losses = sum(loss for loss in loss_dict.values())
        optimizer.zero_grad()
        # Backpropagate once, through apex's scaled loss only
        # (the extra plain losses.backward() call is dropped).
        with amp.scale_loss(losses, optimizer) as scaled_loss:
            scaled_loss.backward()
        optimizer.step()
This raises a RuntimeError:
warnings.warn("The default behavior for interpolate/upsample with float scale_factor will change "
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-65-8fac2bdda8e5> in <module>()
15 # imgs = torch.as_tensor(imgs, dtype=torch.float32)
16 annotations = [{k: v.to(device) for k, v in t.items()} for t in annotations]
---> 17 loss_dict = model(new_imgs, annotations)
18 losses = sum(loss for loss in loss_dict.values())
19
6 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/usr/local/lib/python3.6/dist-packages/torchvision/models/detection/generalized_rcnn.py in forward(self, images, targets)
69 features = OrderedDict([('0', features)])
70 proposals, proposal_losses = self.rpn(images, features, targets)
---> 71 detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
72 detections = self.transform.postprocess(detections, images.image_sizes, original_image_sizes)
73
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/usr/local/lib/python3.6/dist-packages/torchvision/models/detection/roi_heads.py in forward(self, features, proposals, image_shapes, targets)
752 matched_idxs = None
753
--> 754 box_features = self.box_roi_pool(features, proposals, image_shapes)
755 box_features = self.box_head(box_features)
756 class_logits, box_regression = self.box_predictor(box_features)
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
548 result = self._slow_forward(*input, **kwargs)
549 else:
--> 550 result = self.forward(*input, **kwargs)
551 for hook in self._forward_hooks.values():
552 hook_result = hook(self, input, result)
/usr/local/lib/python3.6/dist-packages/torchvision/ops/poolers.py in forward(self, x, boxes, image_shapes)
194 output_size=self.output_size,
195 spatial_scale=scales[0],
--> 196 sampling_ratio=self.sampling_ratio
197 )
198
/usr/local/lib/python3.6/dist-packages/torchvision/ops/roi_align.py in roi_align(input, boxes, output_size, spatial_scale, sampling_ratio, aligned)
43 return torch.ops.torchvision.roi_align(input, rois, spatial_scale,
44 output_size[0], output_size[1],
---> 45 sampling_ratio, aligned)
46
47
RuntimeError: Expected tensor for argument #1 'input' to have the same type as tensor for argument #2 'rois'; but type torch.cuda.HalfTensor does not equal torch.cuda.FloatTensor (while checking arguments for ROIAlign_forward_cuda)
Environment
CUDA used to build PyTorch: 10.1
OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CMake version: version 3.12.0
Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration: GPU 0: Tesla P100-PCIE-16GB
Nvidia driver version: 418.67
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
Versions of relevant libraries:
[pip3] numpy==1.18.4
[pip3] torch==1.5.0+cu101
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.3.1
[pip3] torchvision==0.6.0+cu101
[conda] Could not collect
Did you get this error fixed? I am receiving the same runtime error. The model works perfectly when I run it on its own. I receive this runtime error only when I run it with another model simultaneously.
Not yet. No answer from apex so far.
I am also getting the same error. Is the error fixed?
I ran into the same problem. Is there a solution yet?
RuntimeError: Expected tensor for argument #1 'grad_output' to have the same type as tensor for argument #2 'weight'; but type torch.cuda.HalfTensor does not equal torch.cuda.FloatTensor (while checking arguments for cudnn_convolution_backward_input)
I am hitting this problem too.
Same problem here; I fixed it by casting the tensor with .float().
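A minimal sketch of the cast described above, assuming the mismatch is between an FP16 feature map and FP32 boxes as in the ROIAlign message. The tensors and the `cast_like` helper are hypothetical stand-ins, not names from torchvision or apex, and the sketch runs on CPU:

```python
import torch


def cast_like(tensor, reference):
    """Hypothetical helper: convert `tensor` to the dtype and device of `reference`."""
    return tensor.to(dtype=reference.dtype, device=reference.device)


# Stand-in for a feature map that mixed precision turned into FP16.
features = torch.randn(1, 8, 16, 16).half()
# Stand-in for FP32 boxes in (batch_index, x1, y1, x2, y2) layout.
rois = torch.tensor([[0.0, 1.0, 1.0, 8.0, 8.0]])

# Option 1: the .float() fix, bringing the FP16 tensor up to FP32.
features_fp32 = features.float()

# Option 2: cast the boxes down so they match the FP16 features instead.
rois_fp16 = cast_like(rois, features)
```

Either direction removes the mismatch. Casting up to FP32 is the safer choice for box coordinates, at the cost of the memory savings that FP16 was meant to provide.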
The error is about tensor dtypes: torch.float16 (HalfTensor) is the half-precision counterpart of torch.float32 (FloatTensor), and the operation requires both arguments to have the same dtype.
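To connect the type names in the error message to dtypes, a small sketch on CPU tensors can help. On a GPU the same tensors would report `torch.cuda.FloatTensor` and `torch.cuda.HalfTensor`, the names that appear in the traceback:

```python
import torch

full = torch.ones(3)   # default dtype is torch.float32
half = full.half()     # same values stored as torch.float16

print(full.dtype, full.type())   # torch.float32 torch.FloatTensor
print(half.dtype, half.type())   # torch.float16 torch.HalfTensor

# Half precision uses 2 bytes per element instead of 4.
print(full.element_size(), half.element_size())   # 4 2

# A simple guard to run before an op that requires matching dtypes.
same_dtype = full.dtype == half.dtype
```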
Running the code inside a with autocast(enabled=True): block helped me.
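A hedged sketch of what that can look like with PyTorch's native mixed precision (`torch.cuda.amp`), used here in place of apex `amp`. Note that `torch.cuda.amp` was introduced in PyTorch 1.6, so it is not available in the 1.5.0 environment listed in the original report. The tiny linear model, the MSE loss, and the `train_step` helper are placeholders for illustration, not the Faster R-CNN setup from the question:

```python
import torch
from torch.cuda.amp import GradScaler, autocast


def train_step(model, inputs, targets, optimizer, scaler, use_amp):
    """One optimization step with the forward pass and loss under autocast."""
    optimizer.zero_grad()
    with autocast(enabled=use_amp):
        # Ops inside this block are run in dtypes chosen by autocast.
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
    # Scale the loss, backpropagate once, then step and update the scale.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()


# Mixed precision only applies on a GPU; fall back to plain FP32 on CPU.
use_amp = torch.cuda.is_available()
device = torch.device("cuda" if use_amp else "cpu")

model = torch.nn.Linear(4, 1).to(device)   # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = GradScaler(enabled=use_amp)

inputs = torch.randn(8, 4, device=device)
targets = torch.randn(8, 1, device=device)
loss_value = train_step(model, inputs, targets, optimizer, scaler, use_amp)
```

For a torchvision detection model, the body of the `with autocast(...)` block would presumably become `loss_dict = model(imgs, annotations)` followed by summing the losses, with the `amp.scale_loss` context manager replaced by the `scaler` calls.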