The effect of training data?

Open XiaoBuL opened this issue 2 years ago • 3 comments

Hello, thanks for your great work.

I'm curious about the effect of the training data. Did you ever directly fine-tune the full SD-1.5 or SD-XL model on it?

I suspect fine-tuned SD-1.5 would also benefit from the training data, e.g., in T2I-CompBench performance.

Could you report the performance of SD-1.5 or SD-XL fine-tuned on your training data?

Thanks!

XiaoBuL avatar Mar 13 '24 03:03 XiaoBuL

Yes, fine-tuned SD-1.5 or SD-XL might gain some improvement from our training data. However, we didn't fine-tune SD-1.5 or SD-XL, for two reasons:

  1. some of our prompts are longer than the 77-token limit of SD's CLIP text encoder (illustrated in the sketch below);
  2. we want our model to be easily combined with community models and downstream tools.
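To make reason 1 concrete, here is a minimal sketch (my own illustration, not the authors' code) of why long prompts are a problem for vanilla SD: the CLIP text encoder has a fixed 77-token context window, so anything beyond it is simply cut off.

```python
# Count CLIP tokens for a prompt; SD-1.5/SD-XL text encoders only see 77 of them.
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

prompt = "a long, densely detailed caption " * 20  # stand-in for a long training prompt
num_tokens = len(tokenizer(prompt).input_ids)      # count includes BOS/EOS tokens
print(num_tokens, tokenizer.model_max_length)      # well over the 77-token limit
```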

fangyixiao18 avatar Mar 13 '24 12:03 fangyixiao18

Thanks for your reply!

You could simply truncate prompts to 77 tokens when fine-tuning SD-1.5 or SD-XL (e.g., as in the sketch below).
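A hypothetical sketch of that truncation, assuming a `captions` list of training captions; this is the standard `transformers` truncation path, not code from this repo:

```python
# Clip every caption to CLIP's 77-token window before fine-tuning vanilla SD.
from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

enc = tokenizer(
    captions,                               # assumed: list of training captions
    padding="max_length",
    truncation=True,                        # drop everything past the window
    max_length=tokenizer.model_max_length,  # 77 for CLIP
    return_tensors="pt",
)
input_ids = enc.input_ids  # shape (batch, 77), ready for the CLIP text encoder
```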

I'm still curious whether the improvement comes from the LLM or from the training data.

XiaoBuL avatar Mar 13 '24 16:03 XiaoBuL

Actually, we're quite curious about that too. We'll try to gather enough GPUs to fine-tune SD-1.5 ~or SDXL~.

budui avatar Mar 13 '24 17:03 budui

Thanks! Looking forward to your results!

XiaoBuL avatar Mar 23 '24 11:03 XiaoBuL

We fine-tuned the whole U-Net of SD v1.5 on the proposed datasets, using the same training hyperparameters as for ELLA-SD1.5 (which incorporates T5-XL and TSC). Both models were trained for 140,000 optimization steps, corresponding to roughly one epoch:

[image: results of the fully fine-tuned SD v1.5 vs. ELLA-SD1.5]
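For readers who want a comparable baseline, here is a minimal, hypothetical sketch of what fully fine-tuning the SD v1.5 U-Net looks like with `diffusers`; the model id, learning rate, and `dataloader` are assumptions for illustration, not the hyperparameters used above:

```python
# Sketch of full U-Net fine-tuning for SD v1.5 (epsilon-prediction objective).
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, DDPMScheduler, UNet2DConditionModel
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "runwayml/stable-diffusion-v1-5"  # assumed checkpoint
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
noise_scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

# Only the U-Net is trained; the VAE and text encoder stay frozen.
vae.requires_grad_(False)
text_encoder.requires_grad_(False)
unet.train()
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)  # placeholder lr

for images, captions in dataloader:  # assumed dataloader of (pixels, caption) pairs
    # Encode images into the VAE latent space.
    latents = vae.encode(images).latent_dist.sample() * vae.config.scaling_factor
    # Captions are truncated to CLIP's 77-token window, as discussed above.
    ids = tokenizer(captions, padding="max_length", truncation=True,
                    max_length=77, return_tensors="pt").input_ids
    text_emb = text_encoder(ids)[0]
    # Add noise at a random timestep and train the U-Net to predict it.
    noise = torch.randn_like(latents)
    t = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                      (latents.shape[0],), device=latents.device)
    noisy = noise_scheduler.add_noise(latents, noise, t)
    pred = unet(noisy, t, encoder_hidden_states=text_emb).sample
    loss = F.mse_loss(pred.float(), noise.float())
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```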

budui avatar May 17 '24 07:05 budui