Total loss does not decrease during finetuning using SAT
Hi, when I tried to use LoRA to finetune the model, I found that the total loss did not decrease during training. The starting loss was already quite small (around 0.1 on the Disney dataset), and it only fluctuated slightly and randomly over the course of training. Is this expected, and what exactly does the "total loss" shown in the log output mean? Also, is there any way to evaluate the performance of the finetuned model quantitatively? Thank you!
@qidai2000 @zRzRzRzRzRzRzR I have also seen this issue when trying the latest code in this repo with the "THUDM/CogVideoX-5b-I2V" i2v model: the loss is quite low and randomly goes up and down constantly. Is this OK? I'm seeing the same behavior with a bigger dataset.
Same here; I'm not sure what to infer from this.