USBhost

Results 106 comments of USBhost

> @USBhost `For warm up steps that should only apply to constant with warm up scheduler. Iirc.`
>
> ![image](https://user-images.githubusercontent.com/4000772/233170841-6146d658-1f83-4f01-9bfe-d78caae23a6d.png)
>
> All schedulers other than `constant` support a warmup. (inverse_sqrt does...
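To illustrate the warmup behavior being discussed: a linear warmup ramps the learning rate up from zero before holding it at its base value, which is what a `constant_with_warmup`-style scheduler does. This is a minimal stdlib sketch; the function and parameter names are illustrative, not the actual kohya-ss or transformers API.

```python
def lr_at_step(step, base_lr, num_warmup_steps):
    """Linear warmup, then constant learning rate.

    Sketch of 'constant with warmup' scheduling; names are
    illustrative, not a real library API.
    """
    if step < num_warmup_steps:
        # Ramp linearly from ~0 up to base_lr over the warmup window.
        return base_lr * (step + 1) / num_warmup_steps
    # After warmup the rate stays flat at base_lr.
    return base_lr

# Warmup over 4 steps, then constant.
lrs = [lr_at_step(s, 1e-4, 4) for s in range(6)]
```

Other scheduler shapes (cosine, inverse_sqrt, etc.) differ only in what happens after the warmup window ends.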

~~https://github.com/kohya-ss/sd-scripts/blob/main/library/train_util.py#L2451 valid example? AdamW8bit would be nice to have. It will save some VRAM, but idk how much.~~ Forgot, IIRC that broke after bnb 0.35.0, at least for SD. edit:...

So I have been testing this pull and I am running into some issues.

```
export CUDA_VISIBLE_DEVICES=0; python server.py --listen --model llama-13b --load-in-8bit
bin /UI/text-generation-webui/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
Loading llama-13b...
Loading checkpoint shards:...
```

![image](https://user-images.githubusercontent.com/7269941/232106063-b63181d1-2844-42cd-b88f-16ab25721a44.png)

Then I interrupt it.

![image](https://user-images.githubusercontent.com/7269941/232106328-1330e248-3d16-4a19-8d85-63e8a3fd0456.png)

When it does not error out due to a float division by zero error, it does save, but for rank 32 the adapter_model.bin is only 4 KB in size...
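A plausible source of a "float division by zero" on interrupt is averaging accumulated losses when no training step has completed yet. This is a hypothetical stdlib reproduction of that failure mode with a guard, not the actual webui code:

```python
def mean_loss(losses):
    """Average accumulated step losses.

    Hypothetical sketch: interrupting training before the first step
    finishes leaves the accumulator empty, so a naive
    sum(losses) / len(losses) raises ZeroDivisionError.
    """
    if not losses:
        # Guard: nothing accumulated yet, report 0.0 instead of crashing.
        return 0.0
    return sum(losses) / len(losses)
```

With the guard, an interrupt before the first step reports 0.0 rather than raising, though a checkpoint saved at that point would still be essentially empty.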

This is what I'm talking about, something's not right. @mcmonkey4eva

![image](https://user-images.githubusercontent.com/7269941/232124294-87715b4f-51de-4fde-a6a2-97c3a20c16b4.png)

```
export CUDA_VISIBLE_DEVICES=0; python server.py --listen --model llama-13b --load-in-8bit
bin /UI/text-generation-webui/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
Loading llama-13b...
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:12
```

![image](https://user-images.githubusercontent.com/7269941/232142459-7acc497d-91f8-424d-9531-5f21d7fd512d.png)

I OOM training 30B because VRAM jumps to 2x what training should take, just like on 13B. Again, stock/default settings, but I changed the epochs to 1.

Did you see any VRAM issues? Could peft have caused it as well?

I can confirm that going back to 0.37.2 fixed my VRAM jump.
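For anyone hitting the same jump, the rollback is just a version pin (0.37.2 is the last version reported working here; the exact install command may differ per environment):

```shell
# Pin bitsandbytes to the last version without the VRAM regression
pip install bitsandbytes==0.37.2
```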

https://github.com/TimDettmers/bitsandbytes/issues/324 — this finally got reported to bitsandbytes.

@thot-experiment well, currently ooba is broken for whatever reason. ![Screenshot_20230410-231011~2](https://user-images.githubusercontent.com/7269941/231055632-9ae6cb08-7643-420c-b38c-c30a38c9c4db.png)