flastir comments

Results: 2 comments by flastir

Bug: Qwen2-72B-Instruct (and finetunes) Q4_K_M, Q5_K_M generates random output with CuBLAS prompt processing

It appears that the Qwen2-72B-Instruct-Q5_K_M model stopped functioning correctly after release b3091. Result in [release b3091](https://github.com/ggerganov/llama.cpp/releases/tag/b3091):

```
User: who are you?
Llama: I am Llama, a friendly and helpful...
```

Bug: Qwen2-72B-Instruct (and finetunes) Q4_K_M, Q5_K_M generates random output with CuBLAS prompt processing

> Looks like Q5km works ok in latest koboldcpp if I use OpenBLAS instead of CuBLAS for prompt processing. It's slow, though.
> EDIT: CLBlast works too on GPU. Try...