USBhost

Results 106 comments of USBhost

> @USBhost `For warm up steps that should only apply to constant with warm up scheduler. Iirc.`
>
> ![image](https://user-images.githubusercontent.com/4000772/233170841-6146d658-1f83-4f01-9bfe-d78caae23a6d.png)
>
> All schedulers other than `constant` support a warmup. (inverse_sqrt does...
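To illustrate the warmup behavior being discussed: a linear warmup ramps the learning rate up from zero before holding it at its base value, which is what a `constant_with_warmup`-style scheduler does. This is a minimal stdlib sketch; the function and parameter names are illustrative, not the actual kohya-ss or transformers API.

```python
def lr_at_step(step, base_lr, num_warmup_steps):
    """Linear warmup, then constant learning rate.

    Sketch of 'constant with warmup' scheduling; names are
    illustrative, not a real library API.
    """
    if step < num_warmup_steps:
        # Ramp linearly from ~0 up to base_lr over the warmup window.
        return base_lr * (step + 1) / num_warmup_steps
    # After warmup the rate stays flat at base_lr.
    return base_lr

# Warmup over 4 steps, then constant.
lrs = [lr_at_step(s, 1e-4, 4) for s in range(6)]
```

Other scheduler shapes (cosine, inverse_sqrt, etc.) differ only in what happens after the warmup window ends.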

~~https://github.com/kohya-ss/sd-scripts/blob/main/library/train_util.py#L2451 valid example? AdamW8bit would be nice to have. It will save some VRAM, but idk how much.~~ Forgot, IIRC that broke after bnb 0.35.0, at least for SD. edit:...

So I have been testing this pull and I am running into some issues.

```
export CUDA_VISIBLE_DEVICES=0; python server.py --listen --model llama-13b --load-in-8bit
bin /UI/text-generation-webui/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
Loading llama-13b...
Loading checkpoint shards:...
```

![image](https://user-images.githubusercontent.com/7269941/232106063-b63181d1-2844-42cd-b88f-16ab25721a44.png)

Then I interrupt it.

![image](https://user-images.githubusercontent.com/7269941/232106328-1330e248-3d16-4a19-8d85-63e8a3fd0456.png)

When it does not error out due to a float division by zero error, it does save, but for rank 32 the adapter_model.bin is only 4 KB in size...
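A plausible source of a "float division by zero" on interrupt is averaging accumulated losses when no training step has completed yet. This is a hypothetical stdlib reproduction of that failure mode with a guard, not the actual webui code:

```python
def mean_loss(losses):
    """Average accumulated step losses.

    Hypothetical sketch: interrupting training before the first step
    finishes leaves the accumulator empty, so a naive
    sum(losses) / len(losses) raises ZeroDivisionError.
    """
    if not losses:
        # Guard: nothing accumulated yet, report 0.0 instead of crashing.
        return 0.0
    return sum(losses) / len(losses)
```

With the guard, an interrupt before the first step reports 0.0 rather than raising, though a checkpoint saved at that point would still be essentially empty.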

This is what I'm talking about, something's not right. @mcmonkey4eva

![image](https://user-images.githubusercontent.com/7269941/232124294-87715b4f-51de-4fde-a6a2-97c3a20c16b4.png)

```
export CUDA_VISIBLE_DEVICES=0; python server.py --listen --model llama-13b --load-in-8bit
bin /UI/text-generation-webui/venv/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda118.so
Loading llama-13b...
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:12
```

![image](https://user-images.githubusercontent.com/7269941/232142459-7acc497d-91f8-424d-9531-5f21d7fd512d.png)

I OOM training 30B because VRAM jumps to 2x what training should take, just like on 13B. Again, stock/default settings, but I changed the epochs to 1.

Did you see any VRAM issues? Could peft have caused it as well?

I can confirm that going back to 0.37.2 fixed my VRAM jump.
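For anyone hitting the same jump, the rollback is just a version pin (0.37.2 is the last version reported working here; the exact install command may differ per environment):

```shell
# Pin bitsandbytes to the last version without the VRAM regression
pip install bitsandbytes==0.37.2
```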

https://github.com/TimDettmers/bitsandbytes/issues/324 — this finally got reported to bitsandbytes.

@thot-experiment well, currently ooba is broken for whatever reason. ![Screenshot_20230410-231011~2](https://user-images.githubusercontent.com/7269941/231055632-9ae6cb08-7643-420c-b38c-c30a38c9c4db.png)