[image caption] Test error when encoder(image)

Open apeterswu opened this issue 7 years ago • 2 comments

Hi,

I ran into a problem when trying to generate a caption for images from the val2014 dataset, for example COCO_val2014_000000007888.jpg. The error occurs at the line feature = encoder(image):

RuntimeError: Given groups=1, weight[64, 3, 7, 7], so expected input[1, 1, 224, 224] to have 3 channels, but got 1 channels instead

Is something wrong with the encoder, or does the image need some additional transformation first? I am using the sample.py script. Thanks.

apeterswu avatar Dec 28 '18 09:12 apeterswu
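
A note on reading the error: the shapes suggest the encoder's first convolution expects 3 input channels (weight[64, 3, 7, 7]) while the input tensor has only 1 (input[1, 1, 224, 224]), which is what a grayscale image would produce. Below is a minimal sketch, assuming Pillow is installed, for checking how many channels an image has before it reaches the encoder. `count_channels` is a hypothetical helper, not part of sample.py, and the in-memory images stand in for files you would normally open with Image.open(image_path).

```python
from PIL import Image


def count_channels(img):
    """Return the number of bands (channels) in a PIL image."""
    return len(img.getbands())


# Stand-in images created in memory (assumption: real code would call
# Image.open(image_path) on a file from the dataset instead).
gray = Image.new("L", (224, 224))     # mode "L": single-channel grayscale
color = Image.new("RGB", (224, 224))  # mode "RGB": three channels

print(gray.mode, count_channels(gray))
print(color.mode, count_channels(color))
```

If the count is 1 for the failing file, the image is grayscale and needs converting before the usual transforms are applied.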

Same problem here. Have you solved it?

OswaldoBornemann avatar Jan 29 '19 09:01 OswaldoBornemann

I had the same problem and solved it. When you load the image, make sure it has three color channels (RGB); it may be grayscale, in which case you need to convert it. Here is my code:

from PIL import Image

img = Image.open(image_path).convert('RGB')

mdhasanai avatar Mar 09 '19 08:03 mdhasanai
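
For anyone applying this fix, here is a slightly fuller sketch of an image loader that always returns three channels. It assumes Pillow is installed; `load_image_rgb` and its `size` default are hypothetical (the 224 x 224 size is taken from the shape in the error message), and the temporary grayscale file only stands in for a real dataset image.

```python
import os
import tempfile

from PIL import Image


def load_image_rgb(image_path, size=(224, 224)):
    """Open an image, force three RGB channels, and resize it."""
    with Image.open(image_path) as src:
        # convert("RGB") expands grayscale (mode "L") to three channels
        # and leaves images that are already RGB unchanged in content.
        img = src.convert("RGB")
    return img.resize(size)


# Demonstration: write a grayscale file, then load it through the helper.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gray.png")
    Image.new("L", (640, 480)).save(path)
    img = load_image_rgb(path)

print(img.mode, img.size)
```

Converting at load time means the same code path handles both color and grayscale files, so the later transform and encoder steps do not need a special case.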