RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Filtering the images containing characters which are not in opt.character
Filtering the images whose label is longer than opt.batch_max_length
dataset_root: all_data
opt.select_data: ['all_data']
opt.batch_ratio: ['1']
dataset_root: all_data dataset: all_data
all_data/en_sample
sub-directory: /en_sample num samples: 882
all_data/rec\test
sub-directory: /rec\test num samples: 0
all_data/rec\train
sub-directory: /rec\train num samples: 0
all_data/rec\val
sub-directory: /rec\val num samples: 0
num total samples of all_data: 882 x 1.0 (total_data_usage_ratio) = 882
num samples of all_data per batch: 10 x 1.0 (batch_ratio) = 10
Total_batch_size: 10 = 10
dataset_root: all_data/en_sample dataset: /
all_data/en_sample/
sub-directory: /. num samples: 882
...
continue to train, start_iter: 300000
training time: 11.559250354766846
RuntimeError Traceback (most recent call last)
Cell In[6], line 2
1 opt = get_config("config_files/en_fine_tunning_config.yaml")
----> 2 train(opt, amp=False)
File c:\Users\mengfoong\Desktop\Train_Docling_2\EasyOCR-Trainer\train.py:233, in train(opt, show_number, amp)
230 model.eval()
231 with torch.no_grad():
232 valid_loss, current_accuracy, current_norm_ED, preds, confidence_score, labels,
--> 233 infer_time, length_of_data = validation(model, criterion, valid_loader, converter, opt, device)
234 model.train()
235 print(infer_time, length_of_data)
File c:\Users\mengfoong\Desktop\Train_Docling_2\EasyOCR-Trainer\test_1.py:45, in validation(model, criterion, evaluation_loader, converter, opt, device)
43 preds_size = torch.IntTensor([preds.size(1)] * batch_size)
44 # permute 'preds' to use CTCloss format
---> 45 cost = criterion(preds.log_softmax(2).permute(1, 0, 2), text_for_loss, preds_size, length_for_loss)
47 if opt.decode == 'greedy':
48 # Select max probabilty (greedy decoding) then decode index to character
49 _, preds_index = preds.max(2)
File c:\Users\mengfoong\Desktop\Train_Docling_2\venv\Lib\site-packages\torch\nn\modules\module.py:1739, in Module._wrapped_call_impl(self, *args, **kwargs)
1737 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1738 else:
...
3085 _Reduction.get_enum(reduction),
3086 zero_infinity,
3087 )
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
I get this error when I run the cell. Can anyone share how they resolved it?
I'm facing the same problem. Have you fixed it?
Yes, same here. I'm able to run it on CPU, but not on GPU.
Has anyone solved this problem? Can you share it?
It's been a long time, but I remember solving it by following the instructions for installing a PyTorch build with CUDA (GPU) support. Remember to install the NVIDIA driver and CUDA as well.
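If you want to confirm that the installed PyTorch build can actually see the GPU, something like the following might help. It is only a rough check, and `describe_torch_device` is a made-up helper name, not part of EasyOCR-Trainer:

```python
import torch


def describe_torch_device():
    """Summarise the installed PyTorch build (hypothetical helper, for illustration only)."""
    info = {
        "torch_version": torch.__version__,
        # torch.version.cuda is None when PyTorch was installed as a CPU-only build.
        "built_with_cuda": torch.version.cuda,
        # False if the build is CPU-only or no usable GPU/driver is found.
        "cuda_available": torch.cuda.is_available(),
    }
    info["device"] = "cuda" if info["cuda_available"] else "cpu"
    return info


print(describe_torch_device())
```

If `built_with_cuda` prints `None`, the installed wheel is CPU-only and would need to be replaced with a CUDA build.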
It's been a long time, but I will try my best to help.
I also hit this problem after training for around an hour, which wasted a lot of time.
The problem is easy to solve, though: move all the tensors passed to the loss onto the same device, i.e. in validation() in test_1.py:
# Calculate evaluation loss for CTC decoder.
preds_size = torch.IntTensor([preds.size(1)] * batch_size).to(device)
# permute 'preds' to use CTCloss format
cost = criterion(preds.log_softmax(2).permute(1, 0, 2), text_for_loss.to(device), preds_size, length_for_loss.to(device))
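To try the same idea outside the trainer, here is a small standalone sketch. The shapes, the random data and the `move_to_device` helper are my own assumptions for illustration; only the `criterion(...)` call mirrors the line in validation():

```python
import torch


def move_to_device(device, *tensors):
    """Return the given tensors moved to `device` (hypothetical helper)."""
    return tuple(t.to(device) for t in tensors)


# Falls back to CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
criterion = torch.nn.CTCLoss(zero_infinity=True).to(device)

# Made-up sizes: 2 samples, 8 time steps, 5 classes (index 0 is the CTC blank).
batch_size, time_steps, num_classes = 2, 8, 5
preds = torch.randn(batch_size, time_steps, num_classes, device=device)

# These are created on the CPU, like preds_size in the traceback.
text_for_loss = torch.randint(1, num_classes, (batch_size, 4), dtype=torch.long)
length_for_loss = torch.IntTensor([4, 3])
preds_size = torch.IntTensor([preds.size(1)] * batch_size)

# Move everything that is passed to the loss onto the same device as preds.
text_for_loss, length_for_loss, preds_size = move_to_device(
    device, text_for_loss, length_for_loss, preds_size
)

# permute 'preds' to the (time, batch, classes) layout that CTCLoss expects
cost = criterion(preds.log_softmax(2).permute(1, 0, 2), text_for_loss, preds_size, length_for_loss)
print(cost.device, cost.item())
```

Calling `.to(device)` on a tensor that is already on that device is harmless, so it should be safe to apply it to every argument rather than working out which one is still on the CPU.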