InteractDiffusion Inference Error

When I run the command python inference_batch.py --batch_size 1 --folder generated_output --seed 489 --scheduled-sampling 1.0 --half, an unexpected error occurs:

RuntimeError: Error(s) in loading state_dict for FrozenCLIPEmbedder:
        Unexpected key(s) in state_dict: "transformer.text_model.embeddings.position_ids".

It originates from this line in inference.py: text_encoder = instantiate_from_config(config['text_encoder']).to(device).eval(). Incidentally, I used interact-diffusion-v1.pth. Is there anything wrong with this checkpoint? Thank you.

Jun 12 '24 07:06 Hammour-steak

Newer versions of the transformers library handle transformer.text_model.embeddings.position_ids differently, so the key saved in the checkpoint is no longer expected when the state dict is loaded. Using an older version works for me: pip install transformers==4.30.1.
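
If downgrading is not an option, another possible workaround is to remove the stale key from the state dict before it is loaded. The following is only a sketch under assumptions: drop_keys is a hypothetical helper, the toy dictionary stands in for the text-encoder weights, and where exactly the checkpoint stores those weights is not shown in this thread.

```python
from collections import OrderedDict

# Key reported as unexpected in the error message above.
STALE_KEY = "transformer.text_model.embeddings.position_ids"


def drop_keys(state_dict, keys_to_drop=(STALE_KEY,)):
    """Return a copy of state_dict without the listed keys (order preserved).

    Hypothetical helper: it works on any mapping, including a PyTorch state dict.
    """
    drop = set(keys_to_drop)
    return OrderedDict((k, v) for k, v in state_dict.items() if k not in drop)


# Toy stand-in for the text-encoder weights; the values are placeholders.
toy_state = OrderedDict([
    (STALE_KEY, [0, 1, 2]),
    ("transformer.text_model.embeddings.token_embedding.weight", [0.1, 0.2]),
])

cleaned = drop_keys(toy_state)
print(list(cleaned))  # the stale key is no longer listed
```

With a real checkpoint, the filtered dict would be passed to the text encoder's load_state_dict call. PyTorch's load_state_dict also accepts strict=False, which ignores unexpected keys, but it silently ignores missing keys as well, so filtering the one known key is the more conservative choice.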

Jun 12 '24 07:06 jiuntian

Thank you, it works now.

Jun 13 '24 08:06 Hammour-steak

It seems there is a slight difference from what the paper reports when I use the pretrained model v1.0.

Jun 19 '24 05:06 Hammour-steak

It could be attributed to randomness in the initial noise.
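
As a toy illustration of that point, the sketch below uses Python's standard random module as a stand-in for the sampler's noise generator (the real script presumably draws noise with PyTorch, which is an assumption here). initial_noise is a hypothetical helper, not part of the repository.

```python
import random


def initial_noise(seed, n=4):
    """Draw n standard-normal samples from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


same_a = initial_noise(489)
same_b = initial_noise(489)
other = initial_noise(490)

print(same_a == same_b)  # same seed reproduces the same noise
print(same_a == other)   # a different seed gives different noise
```

Runs that start from different noise generally produce different samples, so metrics computed on them can deviate somewhat from the reported numbers unless the seed and the rest of the setup match.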

Jun 26 '24 06:06 jiuntian