Vision encoder output dimension does not match

Open yuaoze opened this issue 1 year ago • 15 comments

Hi, thanks for your excellent work! I'm trying to run bash eval_calvin.sh. In FeedbackPolicy/models/policy.py, the vision_x input to vision_encoder has a spatial size of 192 * 192, which does not match the model's input size of 224 * 224. I therefore interpolated vision_x to 224 * 224, but the output of vision_encoder then has shape 8 * 768, which does not match the dimensions expected by the rearrange operation: vision_x = rearrange(vision_x, "(b T) d h w -> b T (h w) d", b=b, T=T)

yuaoze avatar Oct 18 '24 07:10 yuaoze

Thanks for your interest in our work! We modified the default input size of the VC1-Base model (from 224 to 192) in its config file. A small tweak to that config should let you run our evaluation scripts.

Further updates are welcome if it fails to solve your issue. 😃

retsuh-bqw avatar Oct 18 '24 10:10 retsuh-bqw

Hi, I followed your advice and modified the config file of the VC1-Base model, but the error still occurred. Here are the details. (image)

yuaoze avatar Oct 21 '24 01:10 yuaoze

I solved this issue by specifying output_size: 192 under "transform" in the config file. However, the output of vision_encoder now has shape 8 * 768, which does not match the dimensions expected by the rearrange operation: vision_x = rearrange(vision_x, "(b T) d h w -> b T (h w) d", b=b, T=T). Can you give me some advice?

yuaoze avatar Oct 21 '24 03:10 yuaoze

My bad. You should also set use_cls to False in the config file. Then the encoder will return all feature tokens.
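
A rough sketch of the shape bookkeeping behind this advice, in plain Python. It assumes a patch size of 16 and that, with use_cls set to False, the encoder returns a (batch, d, h, w) feature grid; both are assumptions for illustration, and the helper names and the b=1, T=8 split are hypothetical.

```python
def encoder_output_shape(batch, img_size=192, patch_size=16, embed_dim=768, use_cls=True):
    """Shape the encoder is ASSUMED to return for `batch` images."""
    if use_cls:
        # Only the CLS token per image: this is the 8 * 768 output from the thread.
        return (batch, embed_dim)
    grid = img_size // patch_size  # patches per side (assumed patch size 16)
    return (batch, embed_dim, grid, grid)


def rearranged_shape(shape, b, T):
    """Shape after rearrange(x, "(b T) d h w -> b T (h w) d", b=b, T=T)."""
    if len(shape) != 4:
        raise ValueError(f"pattern needs 4 dims (b*T, d, h, w), got {len(shape)}: {shape}")
    bt, d, h, w = shape
    if bt != b * T:
        raise ValueError(f"leading dim {bt} is not b*T = {b * T}")
    return (b, T, h * w, d)


b, T = 1, 8  # hypothetical split of the leading dimension of 8
cls_shape = encoder_output_shape(b * T, use_cls=True)    # (8, 768)
grid_shape = encoder_output_shape(b * T, use_cls=False)  # (8, 768, 12, 12)

print(rearranged_shape(grid_shape, b, T))  # (1, 8, 144, 768)
try:
    rearranged_shape(cls_shape, b, T)
except ValueError as err:
    print("CLS-only output cannot be rearranged:", err)
```

Under these assumptions, the CLS-only output has two dimensions, so the four-dimensional pattern cannot match it, while the feature grid can be rearranged.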

retsuh-bqw avatar Oct 21 '24 04:10 retsuh-bqw

Hello! I ran into the same problem. After setting img_size to 192 and use_cls to False, the error still occurs: AssertionError("Input image height (224) doesn't match model (192)."). Can you give me more advice?

hkz103 avatar Oct 28 '24 11:10 hkz103

Could it be the sanity check in VC-1's load_model function (lines 26-29)? You could change the function as follows:

def load_model(
    model,
    transform,
    metadata=None,
    checkpoint_dict=None,
):
    if checkpoint_dict is not None:
        msg = model.load_state_dict(checkpoint_dict)
        log.warning(msg)

    return model
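
To illustrate why the assertion can survive the img_size change, here is a minimal plain-Python sketch. It assumes the sanity check pushes a dummy image through the transform and then through the model, so the transform's output size and the model's img_size must agree. The function names are hypothetical and only mimic the error message quoted above.

```python
def check_input_size(transform_output_size, model_img_size):
    """Mimics the size assertion reported in this thread (hypothetical helper)."""
    assert transform_output_size == model_img_size, (
        f"Input image height ({transform_output_size}) "
        f"doesn't match model ({model_img_size})."
    )


def sizes_agree(transform_output_size, model_img_size):
    """Return True when the two sizes match, otherwise print the error."""
    try:
        check_input_size(transform_output_size, model_img_size)
    except AssertionError as err:
        print("AssertionError:", err)
        return False
    return True


# Transform still resizes to 224 while the model expects 192: the check fails.
sizes_agree(224, 192)
# Both set to 192 (e.g. output_size: 192 under "transform"): the check passes.
sizes_agree(192, 192)
```

If this assumption holds, either removing the check as above or setting the transform's output size to 192 should avoid the error.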

retsuh-bqw avatar Oct 29 '24 04:10 retsuh-bqw

It works! But I ran into a new problem: (image: bug)

hkz103 avatar Oct 30 '24 07:10 hkz103

It seems to be an issue within CALVIN. Is your CALVIN env properly installed?

retsuh-bqw avatar Oct 30 '24 07:10 retsuh-bqw

Hi, I ran `bash eval_calvin.sh`, but got `failed to EGL with glad.` Do you know how to solve this? (screenshot: 微信图片_20241030164904)

gouyinghong avatar Oct 30 '24 08:10 gouyinghong

You are right, I hadn't installed CALVIN properly. However, the packages used by CALVIN and CLOVER seem to conflict. Can you provide a requirements.txt?

hkz103 avatar Oct 30 '24 10:10 hkz103

A requirements.txt is provided at visual_planner/requirements.txt. Which package conflicts are you getting exactly?

retsuh-bqw avatar Oct 30 '24 10:10 retsuh-bqw

Now I'm hitting the "Cannot load URDF file" problem again. The package conflicts are listed below. Can you give me more advice? Thanks for your help! (image: problem)

hkz103 avatar Oct 31 '24 05:10 hkz103

You can try to downgrade your networkx to 2.2. I think the other packages are fine.

retsuh-bqw avatar Oct 31 '24 05:10 retsuh-bqw

With networkx 2.2, AttributeError: "module 'numpy' has no attribute 'int'" is raised. Recent versions of numpy no longer provide the int alias, and networkx 2.2 may still use it.

hkz103 avatar Oct 31 '24 06:10 hkz103

You may try downgrading numpy as well. I'll add the relevant information to a new Troubleshooting section.
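
As a quick way to check whether an environment is affected, the sketch below reports the installed versions and flags numpy releases that lack the int alias. It relies on the alias having been removed in numpy 1.24. The helper names are hypothetical, and the thread does not say which numpy version CLOVER expects.

```python
from importlib.metadata import PackageNotFoundError, version


def major_minor(version_string):
    """Parse '1.23.5' into (1, 23); only the first two fields are used."""
    parts = version_string.split(".")
    return int(parts[0]), int(parts[1])


def numpy_has_int_alias(numpy_version):
    """np.int was removed in numpy 1.24, so older releases still have it."""
    return major_minor(numpy_version) < (1, 24)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


for name in ("networkx", "numpy"):
    print(name, installed_version(name))

numpy_version = installed_version("numpy")
if numpy_version is not None and not numpy_has_int_alias(numpy_version):
    print("This numpy no longer has np.int; code that uses it will raise AttributeError.")
```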

retsuh-bqw avatar Oct 31 '24 12:10 retsuh-bqw