OneTrainer

[Bug]: embeddings not preserved when resuming training after completion

Open · MartinoCesaratto opened this issue 1 year ago • 1 comment

What happened?

After training a LoRA with multiple additional embeddings, I decided to do a few more epochs of training. I noticed that the samples were quite different, and after some testing I can say with near certainty that it retained the LoRA but started training the embeddings from scratch. If I explicitly tell it to load the embeddings from a file, it correctly loads the latest ones, but it shouldn't need to load them from a file, since it already has the trained ones in the backup.

For context, the training started from already-trained embeddings, but I kept training all of them.

If I stop the training and then resume it, it correctly loads the latest embeddings from the backup. The issue only appears if I have already trained for all the epochs I initially set, then add more epochs and resume training.
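
To illustrate, here is the resume logic I would expect (a minimal sketch with hypothetical names, not OneTrainer's actual code; I'm assuming the backup stores embeddings under something like embeddings/<name>.pt). The backup should take priority, with the explicit embedding file only as a fallback for a fresh start:

```python
import os
import torch

def load_embedding_on_resume(backup_dir, embedding_name, explicit_path=None):
    """Prefer the trained embedding from the backup over the initial file."""
    # Hypothetical backup layout: <backup_dir>/embeddings/<name>.pt
    backup_path = os.path.join(backup_dir, "embeddings", f"{embedding_name}.pt")
    if os.path.exists(backup_path):
        # Resume case: continue from the state captured in the backup.
        return torch.load(backup_path)
    if explicit_path is not None:
        # Fresh start: fall back to the user-supplied embedding file.
        return torch.load(explicit_path)
    raise FileNotFoundError(f"no weights found for embedding '{embedding_name}'")
```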

What did you expect would happen?

It should always load the embeddings from the backup if they are present there.

Relevant log output

No response

Output of pip freeze

No response

MartinoCesaratto avatar Jun 21 '24 15:06 MartinoCesaratto

Sorry for the bump, but I feel this is important and people may not notice it.

Repro:

SD1.5 with EMA -> train LoRA + embedding for a few steps -> sample -> stop training -> start training from backup -> sample.

The non-EMA sample is very similar to one with zero training. The EMA sample is identical to the previous one, but if you keep training, it gets updated toward the "new" (untrained) weights and starts becoming worse.

Without EMA this doesn't seem to happen.
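
For context on why the symptom would look like this: EMA keeps a shadow copy of each trainable tensor and nudges it toward the live weights every step. A minimal sketch (generic EMA math, not OneTrainer's actual code) of what I think is happening:

```python
import torch

def ema_update(ema_params, params, decay=0.999):
    # Standard EMA step: shadow weights drift slowly toward the live weights.
    with torch.no_grad():
        for ema, live in zip(ema_params, params):
            ema.mul_(decay).add_(live, alpha=1 - decay)

# If, on resume, the EMA shadow of the embedding is restored from the backup
# but the live embedding is re-initialized from scratch, the first EMA sample
# still looks trained (the shadow is intact), while every subsequent
# ema_update() pulls the shadow toward the untrained live embedding, so the
# EMA samples gradually get worse, matching what I'm seeing.
```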

MartinoCesaratto avatar Sep 20 '24 15:09 MartinoCesaratto

@MartinoCesaratto Could you please run update.bat/sh and then try this again?

O-J1 avatar Oct 13 '24 16:10 O-J1

@MartinoCesaratto Bump: can you please confirm whether this still occurs after updating?

O-J1 avatar Nov 04 '24 07:11 O-J1

Can't reproduce, and I haven't heard back from the author.

O-J1 avatar Feb 14 '25 13:02 O-J1