blauzim
I'm seeing a similar degradation in output quality. It used to match autogptq output quite closely, but the latest releases seem to produce different results. I can get back previous...
Here's an adjusted snippet of the code, nothing too complicated. llama is a Python class that executes a prompt. I've had the same issue / tried it with multiple...
Thanks, I can run the sample code you provided and it works cleanly. So it must be an issue in the code I'm using / how exllama is being called....
Did some further digging. It seems to be related to creating the generator and tokenizer objects inside the "llama" class. When created at the top level it works, but when the...