USBhost
> I'd try the latest default with the smallest model first to make sure that the quantization works and that the resulting safetensors can be loaded in the web UI....
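For that kind of smoke test, something along these lines should work. This is only a sketch: the paths and output names are placeholders, and I'm assuming the current GPTQ-for-LLaMa and web UI flag names (`--wbits`, `--groupsize`, `--model_type`).

```
# Quantize the smallest model first (paths/names are placeholders)
cd repositories/GPTQ-for-LLaMa
python llama.py ../../models/llama-7b c4 --wbits 4 --groupsize 128 \
    --save_safetensors ../../models/llama-7b-4bit-128g/llama-7b-4bit-128g.safetensors

# Then confirm the web UI can actually load the result
cd ../..
python server.py --model llama-7b-4bit-128g --wbits 4 --groupsize 128 --model_type llama
```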
```
remote: Enumerating objects: 7, done.
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 100% (1/1), done.
remote: Total 4 (delta 3), reused 4 (delta 3), pack-reused 0
Unpacking...
```
```
Traceback (most recent call last):
  File "/UI/text-generation-webui/server.py", line 234, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "/UI/text-generation-webui/modules/models.py", line 101, in load_model
    model = load_quantized(model_name)
  File "/UI/text-generation-webui/modules/GPTQ_loader.py", line 69, in load_quantized...
```
GPTQ 4-bit models do not load if they were quantized with [act-order](https://github.com/oobabooga/text-generation-webui/issues/541#issuecomment-1483184836). I am currently testing true-sequential, if that's okay. Edit: true-sequential loads, so only act-order is broken.
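For reference, the two runs look roughly like this. This is just a sketch of the GPTQ-for-LLaMa invocations; the model paths and output names are placeholders.

```
# Loads in the web UI: quantized with --true-sequential (paths/names are placeholders)
python llama.py models/llama-7b c4 --wbits 4 --true-sequential \
    --save_safetensors llama-7b-4bit-ts.safetensors

# Does not load right now: quantized with --act-order
python llama.py models/llama-7b c4 --wbits 4 --act-order \
    --save_safetensors llama-7b-4bit-ao.safetensors
```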
We could follow the naming used in the GPTQ-for-LLaMa README example, https://github.com/qwopqwop200/GPTQ-for-LLaMa/blob/main/README.md?plain=1#L119 ? Having to ship a separate file just for one parameter seems wasteful.
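If I read that example right, the idea would be to bake the quantization parameters into the file name itself so the loader can parse them instead of reading a sidecar file. Something like the following; the exact pattern here is my guess, not a settled convention.

```
# Hypothetical: encode wbits/groupsize in the output name so no extra config file is needed
python llama.py models/llama-65b c4 --wbits 4 --groupsize 128 \
    --save_safetensors llama-65b-4bit-128g.safetensors
```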
Anyway, tomorrow I should have a torrent up with all the new 4-bit stuff. Here's my overnight cooking recipe:
```
#!/bin/bash
. ../venv/bin/activate;
python ../repositories/GPTQ-for-LLaMa/llama.py llama-65b c4 --new-eval --wbits...
```
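The command above is cut off here; for anyone wanting to cook their own, a full overnight run would look roughly like this. This is my own sketch, not the exact script behind the torrent: the model list, paths, and output names are placeholders, and the flags are the GPTQ-for-LLaMa ones as I understand them.

```
#!/bin/bash
# Sketch of an overnight quantization loop -- model list, paths, and output names are placeholders
. ../venv/bin/activate
for m in llama-7b llama-13b llama-30b llama-65b; do
    python ../repositories/GPTQ-for-LLaMa/llama.py "$m" c4 --new-eval \
        --wbits 4 --true-sequential \
        --save_safetensors "${m}-4bit.safetensors"
done
```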
> @USBhost, one last thing: if you create a torrent, make sure to put the tokenizer and config.json files in the respective folders just like in https://huggingface.co/ozcur/alpaca-native-4bit so that we...
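In case it helps anyone else packaging these, this is the layout I understand that to mean. The file names are the usual HF LLaMA conversion output, and the source/target paths here are placeholders.

```
# Hypothetical packaging step: put the HF tokenizer/config next to the quantized weights
# so the web UI can load the folder directly (paths are placeholders)
mkdir -p models/llama-65b-4bit-128g
cp llama-65b-hf/config.json \
   llama-65b-hf/tokenizer.model \
   llama-65b-hf/tokenizer_config.json \
   llama-65b-hf/special_tokens_map.json \
   models/llama-65b-4bit-128g/
cp llama-65b-4bit-128g.safetensors models/llama-65b-4bit-128g/
```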
> And make sure it says `LlamaTokenizer` rather than `LLaMaTokenizer`, otherwise we'll have another big round of support issues 😄

It's straight from the latest convert script, so I think...
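A quick way to double-check that on the finished folder (just a grep; the path is a placeholder):

```
# Should print "LlamaTokenizer"; if it still shows the old LLaMaTokenizer spelling,
# the folder was made with an outdated convert script
grep tokenizer_class models/llama-65b-4bit-128g/tokenizer_config.json
```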
Just a heads up, I am running into some interesting things: https://github.com/qwopqwop200/GPTQ-for-LLaMa/issues/78