
RuntimeError: Error(s) in loading state_dict for ControlLDM: Unexpected key(s) in state_dict: "logvar".

Open · b2zer opened this issue 2 years ago · 7 comments

I am getting an error: RuntimeError: Error(s) in loading state_dict for ControlLDM: Unexpected key(s) in state_dict: "logvar".

It might be related to the fact that the relative directory it tries to load .safetensor models from is unclear (it's neither the stable-diffusion installation directory, nor /visual-chatgpt/--subdir--/--model---.safetensor, nor /visual-chatgpt/ControlNet/ <-- I tried making symlinks there, to no avail). Snip:

    Initializing VisualChatGPT
    Initializing StableDiffusionInpaint to cuda:0
    text_encoder\model.safetensors not found    ## relevant part here
    Initializing ImageCaptioning to cuda:0
    Initializing T2I to cuda:0
    unet\diffusion_pytorch_model.safetensors not found    ## relevant part here
    Direct detect canny.
    Initialize the canny2image model.
    ControlLDM: Running in eps-prediction mode

...Finally, resulting in [after some models and xformers are loaded successfully as also indicated by VRAM usage]:

    ╭─────────────────────── Traceback (most recent call last) ───────────────────────╮
    visual_chatgpt.py:947 in <module>
       944 │   │   return state, state, txt + ' ' + image_filename + ' '
       945
       946 if __name__ == '__main__':
     ❱ 947 │   bot = ConversationBot()
       948 │   with gr.Blocks(css="#chatbot .overflow-y-auto{height:500px}") as demo:
       949 │   │   chatbot = gr.Chatbot(elem_id="chatbot", label="Visual ChatGPT")
       950 │   │   state = gr.State([])

    visual_chatgpt.py:815 in __init__
       812 │   │   self.i2t = ImageCaptioning(device="cuda:0")
       813 │   │   self.t2i = T2I(device="cuda:0")
       814 │   │   self.image2canny = image2canny()
     ❱ 815 │   │   self.canny2image = canny2image(device="cuda:0")
       816 │   │   self.image2line = image2line()
       817 │   │   self.line2image = line2image(device="cuda:0")
       818 │   │   self.image2hed = image2hed()

    visual_chatgpt.py:247 in __init__
       244 │   def __init__(self, device):
       245 │   │   print("Initialize the canny2image model.")
       246 │   │   model = create_model('ControlNet/models/cldm_v15.yaml', device=device).to(device
     ❱ 247 │   │   model.load_state_dict(load_state_dict('ControlNet/models/control_sd15_canny.pth'
       248 │   │   self.model = model.to(device)
       249 │   │   self.device = device
       250 │   │   self.ddim_sampler = DDIMSampler(self.model)

    site-packages\torch\nn\modules\module.py:1671 in load_state_dict
      1668 │   │   │   │   │   │   ', '.join('"{}"'.format(k) for k in missing_keys)))
      1669
      1670 │   │   if len(error_msgs) > 0:
     ❱ 1671 │   │   │   raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
      1672 │   │   │   │   │   │   │   self.__class__.__name__, "\n\t".join(error_msgs)))
      1673 │   │   return _IncompatibleKeys(missing_keys, unexpected_keys)
      1674
    ╰──────────────────────────────────────────────────────────────────────────────────╯

    RuntimeError: Error(s) in loading state_dict for ControlLDM:
        Unexpected key(s) in state_dict: "logvar".
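
For reference, a quick diagnostic sketch of my own (not part of the repo) to check whether the "logvar" entry is actually present in the downloaded checkpoint, i.e. whether it's the file or the loading code that is off:

    import torch

    # Load the ControlNet checkpoint on CPU and look for the offending key.
    sd = torch.load("ControlNet/models/control_sd15_canny.pth", map_location="cpu")
    sd = sd.get("state_dict", sd)   # some checkpoints nest the weights under "state_dict"
    print("logvar" in sd)           # True -> the key really is in the file, not a loader quirk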

b2zer avatar Mar 09 '23 12:03 b2zer

Hi, have you followed our download.sh file to download and implement the ControlNet models?

ZetangForward avatar Mar 09 '23 14:03 ZetangForward

> Hi, have you followed our download.sh file to download and implement the ControlNet models?

Yes, albeit from lllyasviel's git repo (the files were the same size as those on Hugging Face, and the last commit predates my git clone). Just in case, I re-downloaded them all from Hugging Face right now (and cloned the repo again) - still the same error. Thanks!

b2zer avatar Mar 09 '23 20:03 b2zer

Also, just in case - I have been using these ControlNet models successfully [the exact same ones I tried to use with this repo; initially I just linked the models I already had] with AUTOMATIC1111's gradio interface, so there doesn't seem to be anything wrong with the models or with loading them as-is.

b2zer avatar Mar 09 '23 20:03 b2zer

That is a strange issue. The error is reported at line 247 of the code, but the only purpose of that line is to load the model. Since you can use the ControlNet models successfully elsewhere, this line should also run fine, because we load the model exactly the way the original ControlNet repo does. If you still can't solve it, we will launch our online API in the near future, and then you can try it through the API. Thank you for following the project!

ZetangForward avatar Mar 10 '23 01:03 ZetangForward

Thanks for the help while I didn't have time to tinker much! Hooray for Friday, though - I'll post an update on if and how I fixed the issue in the next few days.

Re: the API, I find it funny that just days ago I posted (SM) Bing's frail attempt to depict something using ASCII art (you could really only tell what it was if you read the accompanying text that said what it was), suggesting MS should "just plug Bing into DALL-E - I mean, you already plugged it into the internet, and that's about as risky as it gets [passive prompt injection, e.g.], so might as well".

And now - this. Good on ya! :-) Don't forget to plug a CLIP into [speculation] GPT-4 [/speculation] with that, too - "CLIP interrogator" actually seems to be quite "aligned" in that regard (although a more diverse version NOT tailored to just generating text-to-image prompts would be good; something in between a "brutally honest gradient ascent raw CLIP" and a "CLIP interrogator", maybe!). That would be "a small step for AI" (already existing products), but a giant leap in the evolution of what was previously just "reverse image search".

Anyway, for now, I'd like unlimited turns with this thing - so off to tinkering I go! Thanks for your continued work on useful and entertaining AI products!

b2zer avatar Mar 10 '23 20:03 b2zer

Update [NOT FIXED, but confirmed that "something with the ControlNet models / loading is off"] -> without ControlNet, everything else works fine.

I noticed that my startup errors:

    text_encoder\model.safetensors not found
    unet\diffusion_pytorch_model.safetensors not found

...are random; they sometimes read e.g. "vae\some-model.safetensors not found" instead.

So I edited the seemingly offending pipelines (SD 1.5 and inpainting) to point to my local copies instead:

    self.inpainting = StableDiffusionInpaintPipeline.from_pretrained("I:/my/inpainting/files",).to(device)
    self.pipe = StableDiffusionPipeline.from_pretrained("I:/my/SD-1_5/copy/", torch_dtype=torch.float16)

This seems to have fixed the error with "some.safetensors not found".
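
For completeness, the edit boils down to something like the sketch below; `local_files_only=True` is my own addition (a standard `from_pretrained` flag, not from the repo) to keep diffusers from probing the Hub for the optional *.safetensors variants:

    import torch
    from diffusers import StableDiffusionPipeline, StableDiffusionInpaintPipeline

    # Point both pipelines at complete local snapshots instead of the Hub model IDs.
    inpainting = StableDiffusionInpaintPipeline.from_pretrained(
        "I:/my/inpainting/files", local_files_only=True).to("cuda:0")
    pipe = StableDiffusionPipeline.from_pretrained(
        "I:/my/SD-1_5/copy/", torch_dtype=torch.float16, local_files_only=True)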

I grabbed the visual_chatgpt.py from https://github.com/rupeshs/visual-chatgpt/blob/add-colab-support/visual_chatgpt.py -- it works just fine; I received a weird image 'from ChatGPT' :-D

But - that doesn't load any ControlNet. -> Editing it to add one [canny], as that will still fit into VRAM:

Same error. Even when making everything an absolute path on I:, so as to avoid any symlinks or relative directories, because "you never know". Even with (location='cuda'), which makes no difference - it still produces the same old:

    RuntimeError: Error(s) in loading state_dict for ControlLDM: Unexpected key(s) in state_dict: "logvar".
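
If anyone else lands here, a minimal workaround sketch (untested by me, and assuming the `create_model` / `load_state_dict` helpers from ControlNet's `cldm.model` that the script already uses) would be to drop the stray "logvar" entry before loading, or to load non-strictly, around visual_chatgpt.py:247:

    from cldm.model import create_model, load_state_dict

    device = "cuda:0"
    model = create_model('ControlNet/models/cldm_v15.yaml', device=device).to(device)
    state = load_state_dict('ControlNet/models/control_sd15_canny.pth', location=device)
    state.pop("logvar", None)   # discard the key this ControlLDM build doesn't define
    missing, unexpected = model.load_state_dict(state, strict=False)
    print("missing:", missing, "unexpected:", unexpected)   # sanity-check what got skipped

strict=False alone would also get past the error, but popping the key makes it explicit what is being ignored.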

But I noticed you already have a "CLIP opinion" implemented in addition to the BLIP opinion - nice! ;-)

System info, just in case:

WinVer: 10 / 22H2 / 10.0.19045 Build 19045

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2022 NVIDIA Corporation
    Built on Tue_Mar__8_18:36:24_Pacific_Standard_Time_2022
    Cuda compilation tools, release 11.6, V11.6.124
    Build cuda_11.6.r11.6/compiler.31057947_0

Python 3.8.10, Torch version: 1.13.1+cu116

I actually tried the exact Torch versions from the requirements (torch==1.12.1, torchvision==0.13.1) and got exactly the same KEYS error as above - so I went back to this Torch. Reason: I also got an xformers warning, and remembered, "oh, right - I have Python 3.8.10, so I compiled xformers myself - I don't want to do that again -> back to a suitable PyTorch".

The other requirements are all met, though.

b2zer avatar Mar 11 '23 01:03 b2zer

PS: instruct-pix2pix also works; ChatGPT just gave a robot tentacles. It's only ControlNet that fails... Weird.

b2zer avatar Mar 11 '23 01:03 b2zer