[image caption] Test error when encoder(image)
Hi,
I ran into a problem when trying to generate a caption for images from the val2014 dataset, for example COCO_val2014_000000007888.jpg.
The error occurs at feature = encoder(image):
RuntimeError: Given groups=1, weight[64, 3, 7, 7], so expected input[1, 1, 224, 224] to have 3 channels, but got 1 channels instead
Is something wrong with the encoder, or does the image need some transformation first? I am using the sample.py script. Thanks.
Same problem. Have you solved it?
I had the same problem and solved it. When you load the image, make sure it has three color channels (RGB). Some images are grayscale and have only one channel, which is why the encoder reports 1 channel instead of the 3 it expects. Converting the image to RGB when loading fixes it. Here is my code:
from PIL import Image
img = Image.open(image_path).convert('RGB')
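If it helps, here is a slightly fuller sketch of the same fix wrapped in a function. The name load_rgb is just a hypothetical helper for illustration, not something from sample.py, and I am assuming the image is opened with PIL as in the snippet above. Calling convert('RGB') on an image that is already RGB is harmless, so it should be safe to apply to every file.

```python
from PIL import Image


def load_rgb(image_path):
    """Open an image and return a 3-channel RGB copy.

    Hypothetical helper: grayscale ('L') images are expanded to three
    channels, and images that are already RGB come back as RGB.
    """
    with Image.open(image_path) as img:
        # convert() loads the pixel data and returns a new image,
        # so the result stays usable after the file is closed.
        return img.convert('RGB')
```

You could then call load_rgb(image_path) wherever the image is opened before the resize and transform steps, and check img.mode if you want to confirm it is 'RGB'.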