Regarding input image resolution

Open ariharasudhanm opened this issue 1 year ago • 0 comments

I am trying to train on my own dataset, where the input images are 512x512 grayscale. When I feed an image of shape (3, 512, 512), made by stacking the grayscale image three times to get 3 channels, it throws the following error:

RuntimeError: Given groups=1, weight of size [32, 1, 3, 3], expected input[1, 3, 256, 256] to have 1 channels, but got 3 channels instead

I understand from this that the network expects a single-channel input such as (1, 512, 512), but feeding that shape causes errors in the data loader transformations, such as random rotate and random flip.
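To make the shape question concrete, here is a minimal sketch of the handling I have in mind. It is not this repository's data loader. The weight shape [32, 1, 3, 3] in the error suggests the first convolution takes 1 input channel, so the sketch keeps a single channel and applies the flip and rotation to the spatial axes only. The helper names `to_single_channel` and `random_flip_rotate` are hypothetical, and NumPy stands in for the repository's actual transforms.

```python
import numpy as np


def to_single_channel(image):
    """Return a (1, H, W) array from (H, W), (1, H, W), or stacked (3, H, W) grayscale input."""
    image = np.asarray(image)
    if image.ndim == 2:
        # Plain grayscale image: add a leading channel axis.
        return image[np.newaxis, ...]
    if image.ndim == 3 and image.shape[0] == 1:
        return image
    if image.ndim == 3 and image.shape[0] == 3:
        # Assumption: the 3 channels are copies of one grayscale image,
        # so keeping the first channel loses nothing.
        return image[:1]
    raise ValueError(f"unsupported image shape: {image.shape}")


def random_flip_rotate(image, rng):
    """Randomly flip and rotate a (C, H, W) array on its spatial axes only."""
    if rng.random() < 0.5:
        # Horizontal flip along the width axis; the channel axis is untouched.
        image = np.flip(image, axis=2)
    # Rotate by a random multiple of 90 degrees in the (H, W) plane.
    k = int(rng.integers(0, 4))
    image = np.rot90(image, k=k, axes=(1, 2))
    # flip and rot90 return views; copy to a contiguous array before
    # handing the result to a tensor library.
    return np.ascontiguousarray(image)


rng = np.random.default_rng(0)
gray = rng.random((512, 512), dtype=np.float32)   # stand-in for one grayscale image
stacked = np.stack([gray, gray, gray])            # the (3, 512, 512) input that failed

single = to_single_channel(stacked)               # (1, 512, 512)
augmented = random_flip_rotate(single, rng)       # still (1, 512, 512)
batch = augmented[np.newaxis, ...]                # (1, 1, 512, 512), i.e. N, C, H, W

print(single.shape, augmented.shape, batch.shape)
```

With this layout the channel axis stays at size 1 through augmentation, which matches what the error message says the first layer expects. Whether the repository's own transforms accept a (1, H, W) array is exactly what I am unsure about.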

Could you please provide more information on the expected input image dimensions and how this should be handled?

Thank you.

May 08 '24 14:05 ariharasudhanm