SD3 cannot be finetuned into a better model (hand and face deformation)?

Open KaiWU5 opened this issue 1 year ago • 3 comments

Describe the bug

I want to finetune SD3 to improve its human generation quality using a high-quality human dataset of 3 million images (which has proven useful on SDXL and other models). However, hand and face deformation has not improved much after two days of training.

I am using the train script.

What I have tried so far:

  1. Regular training on the 3 million images with batch size 2x24 (V100) for 2 epochs, with lr 5e-6 and the AdamW optimizer
  2. Training with the Prodigy optimizer and the same settings
  3. Adding q,k RMS norm to each attention layer
  4. Training only several blocks
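
For readers unfamiliar with item 3: q,k RMS norm rescales each query and key vector to unit root-mean-square before the attention dot product, which bounds the size of the attention logits. Below is a minimal, framework-free sketch of that idea in plain Python. It is not the code used in these experiments or in the training script; `rms_norm` and `attention_logit` are hypothetical names, and the learnable gain is assumed to be left at 1.

```python
import math

def rms_norm(vec, weight=None, eps=1e-6):
    """Scale vec to (approximately) unit root-mean-square, then apply an optional per-channel gain."""
    rms = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    if weight is None:
        weight = [1.0] * len(vec)  # assumption: gain left at its usual init of 1
    return [w * x / rms for w, x in zip(weight, vec)]

def attention_logit(q, k, use_qk_norm):
    """Scaled dot-product logit for a single query/key pair."""
    if use_qk_norm:
        q, k = rms_norm(q), rms_norm(k)
    dot = sum(a * b for a, b in zip(q, k))
    return dot / math.sqrt(len(q))

# Illustrative vectors with large magnitudes (made-up values).
q = [50.0, -40.0, 30.0, 20.0]
k = [45.0, -35.0, 25.0, 15.0]

raw = attention_logit(q, k, use_qk_norm=False)
normed = attention_logit(q, k, use_qk_norm=True)
print(raw, normed)
```

With unit gain, the normalized logit can never exceed sqrt(d) in magnitude (here d = 4), whatever the scale of the raw activations. That makes q,k norm mainly a stabilizer for the attention logits, which may be one reason it did not change the deformation results by itself.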

All of these training runs give nearly the same deformed results; the hands never look like normal human hands.

Could you provide some more experiments on SD3 training? There seems to be no easy way to adapt SD3 for human generation.

Reproduction

As described in the bug section above.

Logs

No response

System Info

24 V100 GPUs, batch size 2 per card, 3 million human images with aesthetic score > 4.5
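
For reference, the optimizer-step budget implied by these numbers can be worked out as follows. This is only a back-of-the-envelope sketch: `training_steps` is a hypothetical helper, and it assumes no gradient accumulation (`grad_accum=1`), which the report does not state.

```python
import math

def training_steps(num_samples, per_gpu_batch, num_gpus, epochs, grad_accum=1):
    """Return (effective batch size, steps per epoch, total optimizer steps)."""
    effective_batch = per_gpu_batch * num_gpus * grad_accum
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    return effective_batch, steps_per_epoch, steps_per_epoch * epochs

# Figures from the report: 3 million samples, batch size 2 per card, 24 cards, 2 epochs.
effective_batch, steps_per_epoch, total_steps = training_steps(
    num_samples=3_000_000, per_gpu_batch=2, num_gpus=24, epochs=2
)
print(effective_batch, steps_per_epoch, total_steps)  # 48 62500 125000
```

Under that assumption the run uses an effective batch size of 48 and about 125,000 optimizer steps over the two epochs.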

Who can help?

No response

KaiWU5 • Jul 01 '24 07:07

Hi @KaiWU5, I think this question would be better asked in the Discussions section.

DN6 • Jul 01 '24 08:07

Can you show me your training loss?

mliand • Jul 02 '24 03:07

I have the same question.

heart-du • Jul 02 '24 08:07

[image: loss curve] Here's my loss curve. The loss after adding q,k norm is similar. Thanks for noticing; I will continue discussing the results in the Discussions section.

KaiWU5 • Jul 11 '24 08:07