DEIM
Can I perform multi-GPU training? Why does the training stop early when I set the total number of epochs to 300 with 4 GPUs?
Hi! I am also training DEIM in a multi-GPU setup (4x NVIDIA T4). When you say that the training stops early, are you getting the following error? If so, did you manage to fix it?
[rank1]: File "DEIM/src/deimkit/engine/deim/box_ops.py", line 53, in generalized_box_iou
[rank1]: assert (boxes1[:, 2:] >= boxes1[:, :2]).all()
[rank1]: AssertionError
Moreover, the comments in box_ops.py say:
# degenerate boxes gives inf / nan results
# so do an early check
So I believe that the failing assert cannot be removed.
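In case it helps with debugging, here is a minimal sketch in plain Python of what that assert checks for boxes in (x1, y1, x2, y2) format. The helper `find_degenerate_boxes` is a hypothetical name of my own and is not part of DEIM. I am not certain of the root cause in my run, but note that the check fails for NaN coordinates as well as for inverted boxes, because any comparison with NaN is false. Logging which case occurs just before the failing call might show whether the predictions have diverged or the boxes are simply malformed.

```python
import math


def find_degenerate_boxes(boxes):
    """Return (index, reason) pairs for boxes that would fail the xyxy check.

    `boxes` is a sequence of (x1, y1, x2, y2) tuples. Hypothetical helper:
    it mirrors the condition `boxes[:, 2:] >= boxes[:, :2]` from the assert,
    but reports why each offending box fails.
    """
    bad = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        if any(math.isnan(v) for v in (x1, y1, x2, y2)):
            # NaN >= anything is False, so a NaN coordinate trips the assert.
            bad.append((i, "nan"))
        elif x2 < x1 or y2 < y1:
            # Bottom-right corner lies before the top-left corner.
            bad.append((i, "inverted"))
        # Zero-area boxes (x2 == x1 or y2 == y1) pass the assert.
    return bad


boxes = [
    (0.1, 0.1, 0.5, 0.5),           # valid
    (0.6, 0.2, 0.4, 0.8),           # x2 < x1
    (float("nan"), 0.0, 1.0, 1.0),  # NaN coordinate
]
print(find_degenerate_boxes(boxes))  # [(1, 'inverted'), (2, 'nan')]
```

If the offending boxes turn out to be NaN, the problem is probably upstream of box_ops.py (for example in the loss or the learning rate), which would fit with the assert being only an early check.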