USBhost


See the comment on this commit: https://github.com/oobabooga/text-generation-webui/commit/23a5e886e1aa6849e0819256c3bb4b2bf7d8358e

I did not see this pull request, my bad.

Just FYI: decapoda-research is extremely out of date. Please use [huggyllama](https://huggingface.co/huggyllama) instead.

This sounds amazing; it also reminds me of https://github.com/wawawario2/long_term_memory

Do you happen to have more than one 4-bit model? If so, revert 8c6155251ae9852bbae1fd4df40934988c86a0b1 and report back.
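If you're not sure how to revert just that one commit in a local checkout, it looks like this (run inside your text-generation-webui clone; `--no-edit` keeps the auto-generated commit message):

```shell
# Revert the single commit mentioned above in your local checkout.
# This creates a new commit that undoes its changes; nothing else is touched.
git revert --no-edit 8c6155251ae9852bbae1fd4df40934988c86a0b1
# If it conflicts, `git revert --abort` restores the previous state.
```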

> Or make configurable. Put that and dropout under an "Advanced" accordion.

@mcmonkey4eva Yes, please. Also, I'm just now reading that while looking at my almost-finished 14h 30B training... oops...

```
export CUDA_VISIBLE_DEVICES=0; python server.py --listen --model llama-30b --load-in-8bit
Gradio HTTP request redirected to localhost :)
Loading llama-30b...
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████| 7/7 [01:03
```

> just the `e` notation suddenly appearing because of how low it got.

Oh, I did not catch that.

> Alpaca's trainer just always shuffles by default without an option...

Interesting https://moon-ci-docs.huggingface.co/docs/transformers/pr_1/en/main_classes/trainer#transformers.TrainingArguments.lr_scheduler_type
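Those `lr_scheduler_type` options mostly boil down to simple curves over training steps. A minimal self-contained sketch of the cosine-with-warmup shape (my own approximation, not the actual transformers implementation):

```python
import math

def cosine_with_warmup(step, total_steps, warmup_steps, base_lr):
    """Approximate shape of a cosine schedule with linear warmup:
    LR ramps linearly to base_lr over warmup_steps, then decays
    along a half-cosine to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

This is also where the sudden `e` notation in the logs comes from: near the end of training the cosine tail pushes the LR to tiny values like `1.2e-06`.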

For warmup steps: those should only apply to the constant-with-warmup scheduler, IIRC. Also, might as well throw in 8-bit Adam to save some VRAM. Since we're most likely loading...
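The VRAM saving from 8-bit Adam comes from storing the optimizer's moment tensors in one byte per value instead of four; in practice you'd use bitsandbytes for this. A toy sketch of the underlying idea (simple global-absmax linear quantization; real bitsandbytes uses block-wise quantization, this is just an illustration):

```python
def quantize_8bit(values):
    """Toy linear quantization of floats to int8 codes plus one scale factor."""
    absmax = max(abs(v) for v in values) or 1.0
    scale = absmax / 127.0
    codes = [round(v / scale) for v in values]  # each fits in a signed byte
    return codes, scale

def dequantize_8bit(codes, scale):
    """Recover approximate floats; error is bounded by half the scale."""
    return [c * scale for c in codes]

# Optimizer state stored at 1 byte/value instead of 4:
state = [0.013, -0.002, 0.051, -0.047]
codes, scale = quantize_8bit(state)
recovered = dequantize_8bit(codes, scale)
```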