
[Feat] Extend metadata to save captions

Open Ariloum opened this issue 1 year ago • 5 comments

Hello, can we please have more JSON metadata in the resulting LoRA's safetensors file? kohya-ss stores lots of information, such as:

  • trigger words,
  • which model was used,
  • how many images were used and their bucketed resolutions,
  • caption word frequency,
  • every parameter used to train the model...

And maybe something else.
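To make the "caption word frequency" item concrete, here is a minimal sketch of how such counts could be computed. It assumes captions are comma-separated tag lists; the function name `tag_frequency` and the sample captions are hypothetical and not taken from either tool.

```python
from collections import Counter


def tag_frequency(captions):
    """Count how often each comma-separated tag appears across all captions."""
    counts = Counter()
    for caption in captions:
        for tag in caption.split(","):
            tag = tag.strip().lower()  # normalize whitespace and case
            if tag:  # skip empty entries such as trailing commas
                counts[tag] += 1
    return counts


# Hypothetical captions, for illustration only.
captions = [
    "ohwx woman, portrait, smiling",
    "ohwx woman, outdoors",
    "portrait, Outdoors,",
]
freq = tag_frequency(captions)
print(dict(freq))
```

A table like this, stored next to the weights, is often enough to recover the trigger words later, since they tend to be the most frequent tags.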

Ariloum avatar Feb 27 '24 21:02 Ariloum

Agreed! It would be nice to have similar metadata to what kohya_ss bakes in.

entmike avatar Jun 21 '24 03:06 entmike

There is already an option to save the entire training configuration in the safetensors file. The only things not included are the captions and trigger words, because those don't really exist within OneTrainer. The data loader is not set up to provide that kind of information.
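For anyone who wants to check what a given file actually contains, here is a rough sketch that reads the metadata using only the standard library. It relies on the safetensors layout: an 8-byte little-endian header length, followed by a JSON header whose optional `__metadata__` entry maps strings to strings. The demo file and its `trigger_words` key are made up for illustration; the keys a trainer really writes depend on the tool.

```python
import json
import os
import struct
import tempfile


def read_safetensors_metadata(path):
    """Return the string-to-string metadata stored in a .safetensors header."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is optional, so fall back to an empty dict.
    return header.get("__metadata__", {})


# Demo: build a header-only file by hand (no tensors) so no extra library is needed.
# The "trigger_words" key is hypothetical.
demo_header = json.dumps({"__metadata__": {"trigger_words": "ohwx woman"}}).encode("utf-8")
demo_path = os.path.join(tempfile.mkdtemp(), "demo.safetensors")
with open(demo_path, "wb") as f:
    f.write(struct.pack("<Q", len(demo_header)) + demo_header)

metadata = read_safetensors_metadata(demo_path)
print(metadata)
```

Only the header is read, so this stays fast even on multi-gigabyte model files.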

Nerogar avatar Jun 21 '24 06:06 Nerogar

The data loader is not set up to provide that kind of information

Actually, that's what I miss. If you train 10+ LoRAs, you will start to forget most trigger words after half a year, and storing all that in separate files is not very handy. That's one of the reasons I stopped using OneTrainer.

Ariloum avatar Jun 21 '24 09:06 Ariloum

There is already an option to save the entire training configuration in the safetensors file. The only things not included are the captions and trigger words, because those don't really exist within OneTrainer. The data loader is not set up to provide that kind of information.

I must have overlooked it even after looking at what I thought was all the documentation. Any hint on where?

EDIT: Duh! I found it :) Sorry!

entmike avatar Jun 23 '24 01:06 entmike

The data loader is not set up to provide that kind of information

Actually, that's what I miss. If you train 10+ LoRAs, you will start to forget most trigger words after half a year, and storing all that in separate files is not very handy. That's one of the reasons I stopped using OneTrainer.

Agreed, the two main missing things for me are:

  • Training tokens/text used (similar to what kohya stores)
  • Storing the SHA-256 hash of the base model used in training. Right now, it only saves the filename, which, while helpful, is not as definitive as a hash.
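
As a rough illustration of the hash idea, here is a small sketch that computes a SHA-256 digest of a model file in chunks, so a large checkpoint does not have to fit in memory. The function name `sha256_of_file` and the demo file are my own; how a trainer would store the result in its metadata is left open.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a tiny stand-in file; a real base model would be several gigabytes.
demo_model_path = os.path.join(tempfile.mkdtemp(), "base_model.bin")
with open(demo_model_path, "wb") as f:
    f.write(b"abc")

demo_hash = sha256_of_file(demo_model_path)
print(demo_hash)
```

Unlike a filename, the digest stays the same when the file is renamed and changes if the contents differ by even one byte.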

entmike avatar Jun 25 '24 19:06 entmike