cidtrips

4 comments by cidtrips

Pretty sure this is a side effect of using a model quantized with a newer version of GPTQ-for-LLaMa. Until GPTQ-for-LLaMa is updated in this package, or more likely, the wheel...

> TheBloke has one of his unfiltered models quantized with --no-act-order up on huggingface. That should work for you, and may be a little more helpful than the filtered model.

I've tried several different ways of merging the GPTQ code with fastchat, but I keep breaking down at running a 4-bit quantized model on multiple GPUs. I go back and...

Honestly, it's updating to transformers 4.30, adding one other dependency package, and about 8 changes in the code, if I recall correctly. Plus it works with multiple GPUs. Unfortunately I lost...