Hello! For the first issue, training stops once the maximum number of training steps is reached; you can change it by setting `max_steps` in your training...
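A minimal sketch of capping the number of training steps, assuming a PyTorch Lightning-style trainer; the exact config file and entry point in this repo may differ:

```python
import pytorch_lightning as pl

# Hypothetical setup: the model and datamodule come from your own config.
trainer = pl.Trainer(
    max_steps=25000,    # training stops after this many optimizer steps
    accelerator="gpu",
    devices=1,
)
# trainer.fit(model, datamodule)
```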
Yes, I have also encountered this problem when GPU memory is insufficient.
At least 11 GB is required, which corresponds to an input size of 512.
It takes 2~3 days :)
Hello! 1. v1_face.pth: this checkpoint contains the weights of IRControlNet, which takes a smooth face image as the condition and outputs a high-quality restoration result. 2. I am not familiar with...
> Hi, thanks for replying. Please tell me which VAE you are referring to. Please guide me on how to verify the VAE.

Follow the instructions below: 1. load ControlLDM...
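Since the steps above are truncated, here is a minimal, hypothetical sketch of how one might sanity-check a VAE with an encode/decode round trip; `vae.encode` and `vae.decode` are placeholders for however the loaded ControlLDM exposes its first-stage model:

```python
import math
import torch

@torch.no_grad()
def check_vae_roundtrip(vae, image):
    """Encode an image, decode it back, and report the reconstruction error.

    image: float tensor in [-1, 1], shape (1, 3, H, W), H and W multiples of 64.
    `vae.encode` / `vae.decode` are hypothetical handles to the first-stage model.
    """
    latent = vae.encode(image)   # image -> latent (8x downsampled)
    recon = vae.decode(latent)   # latent -> reconstructed image
    mse = torch.mean((recon - image) ** 2).item()
    psnr = 10 * math.log10(4.0 / mse)  # value range is [-1, 1], so MAX^2 = 4
    print(f"VAE round-trip: MSE={mse:.6f}, PSNR={psnr:.2f} dB")
    return psnr
```

A healthy VAE should reconstruct a clean image with high PSNR; a very low value suggests the wrong weights were loaded.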
Sorry, AdamW is right. We will update our paper.
> > Sorry, AdamW is right. We will update our paper.
>
> Hi, I'd like to follow up: why can the model generate SR at arbitrary upscale factors? I read the paper but didn't fully understand this step. I know the LQ image goes through the preprocess model to become the condition, the condition is VAE-encoded into a latent, and the model generates the high-quality SR result from that latent and x_t (random noise). But I don't understand why this sampler works for different sizes and different upscale factors. Could you explain why? Thanks a lot!

The fundamental reason is that SD's UNet can process a latent z of any size, or more precisely, any latent whose height and width are multiples of 8. In DiffBIR the condition latent is concatenated with z, so the size of the condition latent determines the size of z. Therefore, as long as the condition latent's height and width are multiples of 8, the UNet runs normally. Since the VAE downsamples by 8x, this is equivalent to requiring the condition's height and width to be multiples of 64. In the code we also have a step that pads the condition to a multiple of 64 (a sketch follows below). If anything is unclear, feel free to ask.
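For reference, a minimal sketch of that padding step; the function name and padding mode are illustrative, not the repo's exact code:

```python
import torch.nn.functional as F

def pad_to_multiple(x, multiple=64):
    """Pad (N, C, H, W) on the right/bottom so H and W are multiples of `multiple`.

    After the VAE's 8x downsampling, the condition latent's height and width
    are then multiples of 8, which is what the UNet requires.
    Assumes H, W >= `multiple` (required by reflect padding).
    """
    h, w = x.shape[-2:]
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    return F.pad(x, (0, pad_w, 0, pad_h), mode="reflect")
```

After sampling, the output is typically cropped back to the original size.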
> > > Sorry, AdamW is right. We will update our paper.
> >
> > Hi, I'd like to follow up: why can the model generate SR at arbitrary upscale factors? (the same question as quoted above)
>
> ...
Because we set the degradation range quite wide during training in order to produce a broader range of conditions, and no off-the-shelf model is trained specifically for such wide-range degradation, we trained our own. By a wide range of conditions, I mean that the restoration module's outputs cover everything from very smooth to very sharp, so the resulting IRControlNet can accept a much wider variety of conditions (see the rough sketch below).
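A rough sketch of what wide-range degradation sampling can look like; the operations and parameter ranges here are illustrative, not the paper's exact pipeline:

```python
import random
import numpy as np
import cv2

def degrade(img):
    """img: uint8 HxWx3. Applies blur, noise, and JPEG with wide random ranges."""
    # Wide blur range: conditions span from nearly sharp to very smooth.
    sigma = random.uniform(0.1, 12.0)
    img = cv2.GaussianBlur(img, (0, 0), sigmaX=sigma)
    # Additive Gaussian noise at a random level.
    noise_std = random.uniform(0, 25)
    img = np.clip(img + np.random.randn(*img.shape) * noise_std, 0, 255).astype(np.uint8)
    # JPEG compression at a random quality.
    quality = random.randint(30, 95)
    _, buf = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, quality])
    return cv2.imdecode(buf, cv2.IMREAD_COLOR)
```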