
GUI Backup - Last Epoch will never be saved

Open arkinson9 opened this issue 2 years ago • 6 comments

In the GUI under "Backup" I use the setting "Save After 1 EPOCH", but it seems that the final epoch is never saved.

For example, in a LoRA training run with 10 epochs, the safetensors files for epochs 1 to 9 are saved correctly, but no file is saved after epoch 10 finishes.

arkinson9 avatar Jan 27 '24 12:01 arkinson9

I can see this. However, perhaps it's included as part of the final save instead? I also thought this was intended behaviour.

O-J1 avatar Jan 27 '24 14:01 O-J1

Thank you for your reply. You are right, I just found it. The last epoch is saved in the path specified under:

"model" -> "Model Output Destination".

I had never looked there before.

But maybe it would be a nice feature to have all generated loras in one path, especially with the same file name format as the "saved" ones (timestamp-steps-epoch.safetensors).

arkinson9 avatar Jan 27 '24 15:01 arkinson9

it would be a nice feature to have all generated loras in one path

Not sure why this isn't the default - if the whole point of making backups is to compare them later, why wouldn't we have them all saved in one place AND named appropriately?

Zueuk avatar Mar 14 '24 17:03 Zueuk

Backups are for the trainer and training. Saves are for testing intermediate steps in your inference tool of choice.

Calamdor avatar Mar 14 '24 17:03 Calamdor

I think this is indeed a bug; I don't believe finetune has this issue. @Nerogar

SirTrippsalot avatar Mar 14 '24 19:03 SirTrippsalot

Backups are for the trainer and training. Saves are for testing intermediate steps in your inference tool of choice.

Oh I see, somehow I got them confused.

Btw, the last time I tried OneTrainer, the final epoch backup seemed to have been saved ok.

Zueuk avatar Mar 14 '24 23:03 Zueuk

Closing after discussion with Nero. It's too complicated to be worth fixing. No final save occurs because saving only happens at epoch_step 0, so the check never runs after the last training step at the end of the final epoch.

O-J1 avatar Oct 13 '24 13:10 O-J1
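
For anyone reading this later, the closing explanation can be illustrated with a small simulation. This is only a sketch of the described behaviour, not OneTrainer's actual code: the function name `periodic_save_epochs` and its parameters are hypothetical, and it assumes the periodic check runs at the start of each step and fires only when `epoch_step` is 0.

```python
def periodic_save_epochs(num_epochs, steps_per_epoch, save_every_epochs=1):
    """Return the completed-epoch counts at which a periodic save would fire.

    Hypothetical model of the behaviour described in this thread,
    not the real OneTrainer training loop.
    """
    saved = []
    for epoch in range(num_epochs):
        for epoch_step in range(steps_per_epoch):
            # Assumption: the save check runs before the training step and
            # only fires at epoch_step 0, i.e. when a new epoch begins.
            # At that point `epoch` epochs have been completed.
            if epoch_step == 0 and epoch > 0 and epoch % save_every_epochs == 0:
                saved.append(epoch)
            # ... the actual training step would run here ...
    # The loop ends after the last step of the final epoch, so there is
    # no further epoch_step 0 at which epoch `num_epochs` could be saved.
    return saved


print(periodic_save_epochs(num_epochs=10, steps_per_epoch=5))
# prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With 10 epochs this reproduces the original report: saves for epochs 1 to 9, and nothing for epoch 10, whose result ends up only in the final model output.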