LLamaSharp

Wrong result when changing to another model.

Open icemaple1251 opened this issue 1 year ago • 3 comments

It works well when I use LLama2-7b-Chat, but after I changed the model to a newer one, mixtral-8x7b-v0.1Q2_K, and asked the same question, the bot gave a wrong answer and even changed my original question.

Should I change some options or parameters somewhere when I switch to another model? Can anyone help me? Thanks. [screenshots: wrong, correct]

icemaple1251 avatar Feb 02 '24 07:02 icemaple1251

Q2 is a pretty small quantisation. Have you tested your Q2 model in llama.cpp directly, to check this isn't just a bad response caused by the quantisation?

martindevans avatar Feb 02 '24 14:02 martindevans

I have not tested the Q2 model in llama.cpp directly. But I did try other models, like "mixtral-8x7b-v0.1.Q8_0.gguf", and I still get wrong answers; some answers are repeated several times. Are some models made specifically for chat while others are not?

icemaple1251 avatar Feb 06 '24 08:02 icemaple1251

The mixtral model you mentioned is Q8, which is much more forgiving than Q2. The smaller the number, the more the model has been compressed, and the more likely it is to give bad answers.

martindevans avatar Feb 06 '24 16:02 martindevans
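
To illustrate the point about compression, here is a minimal sketch of why fewer bits per weight lose more information. It assumes plain uniform quantisation over synthetic Gaussian "weights"; this is not the actual Q2_K or Q8_0 scheme used by llama.cpp, and the helper names `quantize_roundtrip` and `mean_abs_error` are made up for this example.

```python
import random


def quantize_roundtrip(values, bits):
    """Uniformly quantize values to 2**bits levels over their range, then dequantize.

    Illustrative only: real GGUF quantisation formats work block-wise
    and are more sophisticated than this.
    """
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1          # number of steps between lo and hi
    scale = (hi - lo) / levels        # width of one quantisation step
    return [lo + round((v - lo) / scale) * scale for v in values]


def mean_abs_error(values, bits):
    """Average absolute difference between original and round-tripped values."""
    restored = quantize_roundtrip(values, bits)
    return sum(abs(a - b) for a, b in zip(values, restored)) / len(values)


random.seed(0)
# Synthetic stand-in for model weights (an assumption, not real model data).
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

errors = {bits: mean_abs_error(weights, bits) for bits in (2, 4, 8)}
for bits, err in errors.items():
    print(f"{bits}-bit: mean abs error = {err:.5f}")
```

The reconstruction error grows as the bit width shrinks, which is the general reason a Q2 file is expected to behave worse than a Q8 file of the same model. It does not by itself explain wrong answers that persist at Q8.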