
[Feat]: LoKr Support

Open dxqb opened this issue 9 months ago • 13 comments

Describe your use-case.

https://arxiv.org/pdf/2309.14859

User reports from using other trainers:

VRAM/performance cost is on par with LoRA training, while giving better quality than a traditional LoRA, and without the ~30% performance penalty of the extra maths required by DoRA; it sits in between the two. In very large runs (>60k) users claim LyCORIS/LoKr (and DoRA) hits a sweet spot.

KohakuBlueleaf uses it for his projects (his finetune was the basis for Illustrious). Most who have tried it vouch for its improvements so far; another user mentioned not seeing an improvement in his tests, but neither was it in any way worse than PEFT (he mostly uses very small datasets, sub 30 images).

LoKr LoRAs are standard on the SimpleTuner Discord. For me, it has proved itself over and over, and I only do LoRAs with LoKr (with the exception of SDXL).

Examples:

  • This is just for SD 3.5.
  • The results for Flux are insane.

What would you like to see as a solution?

Necessary steps:

  • Write a small guide on hyperparameters etc.
  • The mathematical implementation seems rather trivial; maybe 30 lines in LoRAModule.py, as it's already well-abstracted (see the sketch after this list).
  • What's necessary to generate LoKr files that are supported by popular inference tools?
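
A minimal sketch of the core LoKr update, assuming the usual LyCORIS-style formulation in which the weight delta is the Kronecker product of a small dense factor and an optionally low-rank second factor; the function name, shapes, and alpha/rank scaling below are illustrative, not OneTrainer or LyCORIS API:

```python
import torch

def lokr_delta(w1: torch.Tensor, w2_a: torch.Tensor, w2_b: torch.Tensor,
               alpha: float) -> torch.Tensor:
    """Illustrative LoKr weight delta: kron(w1, w2_a @ w2_b) * (alpha / rank).

    w1:   (u1, v1)                small dense Kronecker factor
    w2_a: (u2, r), w2_b: (r, v2)  low-rank decomposition of the second factor
    Result shape: (u1 * u2, v1 * v2), i.e. the full weight shape.
    """
    rank = w2_b.shape[0]
    w2 = w2_a @ w2_b                      # (u2, v2)
    return torch.kron(w1, w2) * (alpha / rank)

# Example: adapt a 3072x3072 linear weight with a 12x12 first factor and a
# rank-16 decomposition of the 256x256 second factor. One factor is commonly
# zero-initialized so the delta starts at zero (which one varies by implementation).
w1 = torch.zeros(12, 12)
w2_a = torch.randn(256, 16) * 0.01
w2_b = torch.randn(16, 256) * 0.01
delta = lokr_delta(w1, w2_a, w2_b, alpha=16.0)
assert delta.shape == (3072, 3072)
```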

Have you considered alternatives? List them here.

LoRA, LoHA

dxqb · Apr 15 '25 11:04

I tried LoKr on Kohya for Flux. It has better multi-concept training, still far from non-bleeding, but lower quality in my experiments.

FurkanGozukara · Apr 15 '25 11:04

I tried LoKr on Kohya for Flux. It has better multi-concept training, still far from non-bleeding, but lower quality in my experiments.

I believe, and have found from my own testing, that Kohya's LoKr implementation is inferior to SimpleTuner and ai-toolkit.

CodeAlexx · Apr 15 '25 14:04

I tried LoKr on Kohya for Flux. It has better multi-concept training, still far from non-bleeding, but lower quality in my experiments.

I believe, and have found from my own testing, that Kohya's LoKr implementation is inferior to SimpleTuner and ai-toolkit.

could be.

FurkanGozukara · Apr 15 '25 14:04

I tried LoKr on Kohya for Flux. It has better multi-concept training, still far from non-bleeding, but lower quality in my experiments.

I believe, and have found from my own testing, that Kohya's LoKr implementation is inferior to SimpleTuner and ai-toolkit.

could be.

Those SD 3.5 (Large and Medium) LoKr LoRAs on the Discord are mine, made with ai-toolkit, sir, and they don't look lower quality to me. And it's SD 3.5! If you only use Kohya, what you say makes sense. Bleeding is stopped with a proper dataset and captions to remove it.

CodeAlexx · Apr 15 '25 14:04

reference implementation with key suffixes that seem to be supported by inference tools: https://github.com/KohakuBlueleaf/LyCORIS/blob/main/lycoris/modules/lokr.py
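
For illustration, a hedged sketch of the tensor names a LoKr state dict would carry per adapted layer; the suffixes follow the naming the linked lycoris/modules/lokr.py appears to use and should be verified against it, and the key prefix convention varies by model and tool, so it is only an example here:

```python
import torch

# Hypothetical export of one adapted linear layer. The prefix is illustrative;
# the suffixes (lokr_w1, lokr_w2_a, lokr_w2_b, alpha) mirror the parameters the
# linked LyCORIS module appears to register. Layers that keep the second factor
# dense would store lokr_w2 instead of the a/b pair, and a decomposed first
# factor would use lokr_w1_a / lokr_w1_b.
prefix = "lora_unet_double_blocks_0_img_attn_qkv"  # example module name only

state_dict = {
    f"{prefix}.lokr_w1":   torch.zeros(12, 12),           # dense Kronecker factor
    f"{prefix}.lokr_w2_a": torch.randn(256, 16) * 0.01,   # low-rank half A of second factor
    f"{prefix}.lokr_w2_b": torch.randn(16, 256) * 0.01,   # low-rank half B of second factor
    f"{prefix}.alpha":     torch.tensor(16.0),
}
```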

dxqb · Apr 27 '25 11:04

I tried LoKr on Kohya for Flux. It has better multi-concept training, still far from non-bleeding, but lower quality in my experiments.

I believe, and have found from my own testing, that Kohya's LoKr implementation is inferior to SimpleTuner and ai-toolkit.

Any insight on why that might be? They both seem to be using just the LyCORIS code. The only thing I've found SimpleTuner doing in addition is initializing the network differently: https://github.com/bghira/SimpleTuner/blob/d107b2bc17618fc64c5ef7f5478ccd64148a51e5/helpers/training/trainer.py#L729

How are you using this parameter init_lokr_norm in practice? What impact does it have?
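
One plausible reading, sketched here as an assumption rather than a description of SimpleTuner's actual code, is a perturbed-normal initialization so the adapter does not start with an exactly-zero delta:

```python
import torch

def perturbed_normal_init_(w1: torch.Tensor, w2: torch.Tensor, std: float = 1e-3) -> None:
    """Hypothetical interpretation of init_lokr_norm: instead of zeroing one
    Kronecker factor (which makes the initial delta exactly zero), give both
    factors a small Gaussian perturbation so training starts from a tiny
    nonzero delta. This is an assumption, not SimpleTuner's confirmed behaviour."""
    torch.nn.init.normal_(w1, mean=0.0, std=std)
    torch.nn.init.normal_(w2, mean=0.0, std=std)

w1 = torch.empty(12, 12)
w2 = torch.empty(256, 256)
perturbed_normal_init_(w1, w2, std=1e-3)
```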

dxqb · Apr 28 '25 09:04

I tried LoKr on Kohya for Flux. It has better multi-concept training, still far from non-bleeding, but lower quality in my experiments.

I believe, and have found from my own testing, that Kohya's LoKr implementation is inferior to SimpleTuner and ai-toolkit.

Any insight on why that might be? They both seem to be using just the LyCORIS code. The only thing I've found SimpleTuner doing in addition is initializing the network differently: https://github.com/bghira/SimpleTuner/blob/d107b2bc17618fc64c5ef7f5478ccd64148a51e5/helpers/training/trainer.py#L729

How are you using this parameter init_lokr_norm in practice? What impact does it have?

I think it's not so much the LyCORIS lib as it is Kohya the trainer. I see better results in SimpleTuner and ai-toolkit. Why is that? I don't have a clue.

CodeAlexx · Apr 29 '25 01:04

It would be pretty nice to have LoKr support; Kohya's implementation is okay, and SimpleTuner's seems pretty well done.

We'd also need to be able to load in a custom LoKr config, with which you can control the factor for each module as well as overall (see the sketch after the links below):

https://github.com/bghira/SimpleTuner/blob/main/config/lycoris_config.json.example

https://github.com/KohakuBlueleaf/LyCORIS

https://github.com/KohakuBlueleaf/LyCORIS/blob/main/lycoris/modules/lokr.py
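
Sketched as a Python dict mirroring the general shape of SimpleTuner's lycoris_config.json.example, the kind of per-module control meant above might look roughly like this; check the exact key names against the linked file rather than copying them from here:

```python
# Illustrative LyCORIS/LoKr config: a global factor plus per-module overrides.
# Key names mirror what the linked lycoris_config.json.example appears to use;
# treat this as a sketch, not the canonical schema.
lycoris_config = {
    "algo": "lokr",
    "multiplier": 1.0,
    "linear_dim": 10000,       # the "full matrix" convention
    "linear_alpha": 1,
    "factor": 16,              # global Kronecker factor
    "apply_preset": {
        "target_module": ["Attention", "FeedForward"],
        "module_algo_map": {
            "Attention": {"factor": 16},
            "FeedForward": {"factor": 8},
        },
    },
}
```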

DarkViewAI · May 17 '25 10:05

Can anyone who uses LoKr provide the smallest possible testcase that shows LoKr working as you expect?

Some public dataset and config that, ideally, works with LoKr but fails with standard LoRA. This would both serve as motivation to implement LoKr in OneTrainer, and as a testcase for the implementation.

dxqb · May 17 '25 12:05

Caith made a LoKr before; I struggle with concept bleeding with regular LoRA training:

https://civitai.com/models/714292/juno-for-flux-by-caith-overwatch-2

I think in general it works better. Here are some key points:

  1. Less prone to overfitting.
  2. File size is controllable via the factor size (see the sketch after this list).
  3. Also, when using bypass_mode it saves VRAM and trains faster and better with batch size.
  4. Multi-character and concept training: I found it better when used with prior model prediction and when training two characters together.
  5. Full matrix mode: I think it's similar to full finetuning (configured with dim 10000).
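
To make the factor/file-size point concrete, a small illustrative helper (not LyCORIS's actual factorization routine) showing how a chosen factor splits a layer's dimensions and what that does to parameter count when both Kronecker factors are kept dense:

```python
def split_dim(dim: int, factor: int) -> tuple[int, int]:
    """Illustrative split of dim into (a, b) with a * b == dim.
    factor > 0   -> a is the largest divisor of dim not exceeding factor.
    factor == -1 -> the most balanced split (a close to sqrt(dim)).
    Mimics the idea behind LyCORIS's factorization, not its exact code."""
    if factor == -1:
        factor = int(dim ** 0.5)
    a = 1
    for candidate in range(1, min(factor, dim) + 1):
        if dim % candidate == 0:
            a = candidate
    return a, dim // a

# A 3072x3072 linear layer, with both factors kept dense ("full matrix" style).
# In practice a small factor is usually paired with a low-rank second factor,
# whose cost would be rank * (out_large + in_large) instead of out_large * in_large.
out_dim = in_dim = 3072
for factor in (4, 16, -1):
    out_s, out_l = split_dim(out_dim, factor)
    in_s, in_l = split_dim(in_dim, factor)
    total = out_s * in_s + out_l * in_l
    print(f"factor={factor:>3}: w1 {out_s}x{in_s}, w2 {out_l}x{in_l}, "
          f"{total} params vs {out_dim * in_dim} in the full weight")
```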

DarkViewAI · May 17 '25 20:05

Also, I'm not sure about conv_dim and conv_alpha.

It works well to train those on SDXL; not sure about Flux, but it would be good to have.

DarkViewAI · May 18 '25 02:05

I'm going to stress dxqb's point here: to convince Nero to merge a resulting PR, you need to provide a reproducible dataset (and configs) and/or a conclusive experiment where it is clearly demonstrated, with evidence, that LoKr works or gives a large improvement where traditional LoRA does not. From past comments, he isn't convinced by anecdotes.

This means a huge number of sweeps with traditional LoRA and with LoKr, and many samples. LoKr has come up in the past and not a single person has been able to provide conclusive proof for Nero to observe. DoRA had proof; that's why it got merged.

P.S. I am not making a judgement on the effectiveness of LoKr, just explaining how to maximise your chances. Please try to keep noise in this thread (hearsay or anecdotal stuff) to a minimum, as it's not helpful.

O-J1 · May 18 '25 05:05

I wouldn't put it as strongly, because of the many anecdotal reports that LoKr works better, some of them from clearly very experienced trainers. A testcase and your hyperparameters will still be useful.

dxqb · May 18 '25 06:05