ControlNet: much worse test results with gradio_canny2image.py than in validation

Hi, thanks for your great work. I've trained ControlNet using Canny edges. After 30 epochs, the validation images in image_log look quite realistic. So I loaded the model in gradio_canny2image.py and tried many different parameter settings, such as CFG, but the results are always much worse than the validation results. Is there any difference between the validation process (or its parameter settings) and the web version?

Please help me~! Thank you!

May 25 '23 05:05 dedoogong

@dedoogong Have you figured out the reason? If yes, kindly elaborate.

Jun 27 '23 06:06 HassanBinHaroon

I have the same problem. I trained a controlnet model with my own dataset, and the results are much worse when I load the model in automatic1111's webui than the results in image_log.

Jul 10 '23 06:07 ELEPOT

Yes, I still have the problem. I wonder what the difference is between the evaluation process (run every 300 iterations) and the standalone inference process (gradio_canny2image.py), or between their parameter settings.

Jul 10 '23 07:07 dedoogong

After digging through some code, I found that in the evaluation process the CFG scale is 9, the number of sampling steps is 50, and the sampler seems to be DDIM. But after I applied these changes, the problem was still not fixed. Hope this helps in some way.
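
For context on what the CFG scale of 9 does: as far as I can tell, the DDIM sampler mixes the conditional and unconditional noise predictions using the usual classifier-free guidance rule. Below is a minimal sketch with plain Python lists and made-up numbers; `apply_cfg` is a hypothetical helper name, not a function from the ControlNet repo.

```python
def apply_cfg(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: move from the unconditional prediction
    towards (and past) the conditional one by a factor of `scale`."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Made-up noise predictions for two latent values (illustration only).
eps_uncond = [0.0, 1.0]
eps_cond = [1.0, 1.0]

guided = apply_cfg(eps_uncond, eps_cond, scale=9.0)
print(guided)
```

With a scale of 1.0 the result is just the conditional prediction; larger values push further away from the unconditional one. So the scale and whatever is fed into the unconditional branch both affect the output, and both would need to match between validation and inference.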

Jul 10 '23 10:07 ELEPOT

Did you upload a Canny image when testing?

Nov 17 '23 09:11 Edwardlmaooooooo

Have you solved the problem? I had the same problem and didn't know how to solve it. If you know the solution, please let me know, thank you very much!

Dec 18 '23 11:12 kako523

Hi, I have the same problem too. Have you figured out the reason? If so, please let me know. I would appreciate it very much!

Jan 10 '24 08:01 Namn23

This problem has been bothering me for a long time, but after some effort I wrote an inference script, and now it works reasonably well. I hope it helps.

I figured it might be because the test code in gradio_canny2image.py differs from log_images in the ControlLDM class, which is used during training. I'm still not sure why, and I'd like someone to explain what's going on. My work doesn't require a text prompt, so the inference script has no text input.

My coding skills are limited, so if anyone can improve it further, that would be great!

from share import *

import os

import cv2
import einops
import numpy as np
import torch
from PIL import Image

from annotator.util import resize_image
from cldm.ddim_hacked import DDIMSampler
from cldm.model import create_model, load_state_dict


# Configs
resume_path = '/ControlNet/lightning_logs/version_6/checkpoints/last.ckpt'  # your checkpoint path
img_path = 'your image path'  # control image, fed to the model as-is (no Canny detection is applied here)
output_dir = './outputs'
N = 1  # batch size; the saving code below only writes the first sample
ddim_steps = 50
guidance_scale = 9.0


# Load the trained checkpoint and build the DDIM sampler.
model = create_model('./models/cldm_v21.yaml').cpu()
model.load_state_dict(load_state_dict(resume_path, location='cuda'))
model = model.cuda()
model.eval()
ddim_sampler = DDIMSampler(model)

# Read the control image as RGB and resize it to a working resolution of 512.
img = cv2.imread(img_path)
if img is None:
    raise FileNotFoundError(f'Could not read image: {img_path}')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = resize_image(img, 512)


with torch.no_grad():
    # HWC uint8 -> BCHW float in [0, 1]
    control = torch.from_numpy(img.copy()).float().cuda() / 255.0
    control = torch.stack([control for _ in range(N)], dim=0)
    control = einops.rearrange(control, 'b h w c -> b c h w').clone()

    # No text prompt: both branches use the empty-prompt embedding.
    # The control image is also kept in the unconditional branch.
    c_cross = model.get_unconditional_conditioning(N)
    uc_cross = model.get_unconditional_conditioning(N)
    cond = {"c_concat": [control], "c_crossattn": [c_cross]}
    uc_full = {"c_concat": [control], "c_crossattn": [uc_cross]}

    _, _, h, w = control.shape
    shape = (4, h // 8, w // 8)  # latent shape: (channels, H/8, W/8)

    samples, intermediates = ddim_sampler.sample(ddim_steps, N,
                                                 shape, cond, verbose=False, eta=0.0,
                                                 unconditional_guidance_scale=guidance_scale,
                                                 unconditional_conditioning=uc_full
                                                 )
    x_samples = model.decode_first_stage(samples)

# [-1, 1] -> [0, 1], clamped so out-of-range values cannot wrap when cast to uint8
x_samples = torch.clamp((x_samples + 1.0) / 2.0, 0.0, 1.0)
x_samples = einops.rearrange(x_samples, 'b c h w -> b h w c')[0]
x_samples = (x_samples.cpu().numpy() * 255).astype(np.uint8)

os.makedirs(output_dir, exist_ok=True)
image_name = os.path.basename(img_path)
Image.fromarray(x_samples).save(os.path.join(output_dir, image_name))
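
A note on the final conversion step: the decoder output is only roughly in [-1, 1], and casting out-of-range floats to uint8 in NumPy is not guaranteed to saturate, so it seems safer to clip before the cast. Here is a minimal pure-Python sketch of that mapping for a single value; `to_uint8` is a hypothetical helper used only for illustration.

```python
def to_uint8(value: float) -> int:
    """Map a decoder output in roughly [-1, 1] to an 8-bit pixel value."""
    scaled = (value + 1.0) / 2.0          # [-1, 1] -> [0, 1]
    clipped = min(1.0, max(0.0, scaled))  # guard against overshoot on either side
    return int(clipped * 255)             # truncates, like astype(np.uint8)


# Values slightly outside [-1, 1] end up at the range limits instead of wrapping.
pixels = [to_uint8(v) for v in (-1.2, -1.0, 0.0, 1.0, 1.1)]
print(pixels)
```

In the tensor version, the same idea is a clamp to [0, 1] (or a clip to [0, 255]) applied before converting to uint8.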

Mar 19 '24 09:03 SummerWRain