example/speculative: drafting fails completely when params.sparams.temp is set to 0

Open mscheong01 opened this issue 1 year ago • 2 comments

In the current speculative.cpp implementation, params.sparams.temp is forced to -1.0f. However, if I change this value to 0: [image]

draft sampling seems to fail completely: [image] (speculative.log)

Is this intended behavior? I'm working on #5625, which removes the temperature limit, so I'd like to get this fixed.

mscheong01, Feb 22 '24 05:02

I guess this is because, when the temperature is 0, the sampling logic does not output probabilities for each token? [image] If so, this seems like a viable solution. [image]
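
For illustration, here is a minimal, self-contained sketch of the suspected cause and of one possible shape of the fix. It does not use the llama.cpp API: `Candidate`, `pick_greedy`, `apply_softmax`, and `sample_temp_zero_with_probs` are hypothetical names made up for this example, and the actual change in the screenshot may look different. The assumption is that a purely greedy pick returns the argmax token without ever filling in per-token probabilities, so any drafting logic that reads those probabilities sees zeros.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a sampler candidate; not the llama.cpp type.
struct Candidate {
    int   id;
    float logit;
    float p;  // probability; stays at its initial value unless softmax is run
};

// Greedy pick only: returns the id with the highest logit and leaves `p` untouched.
// Assumes `cands` is non-empty.
int pick_greedy(const std::vector<Candidate> & cands) {
    auto it = std::max_element(cands.begin(), cands.end(),
        [](const Candidate & a, const Candidate & b) { return a.logit < b.logit; });
    return it->id;
}

// Numerically stable softmax that fills in `p` for every candidate.
// Assumes `cands` is non-empty.
void apply_softmax(std::vector<Candidate> & cands) {
    float max_logit = cands.front().logit;
    for (const auto & c : cands) {
        max_logit = std::max(max_logit, c.logit);
    }
    float sum = 0.0f;
    for (auto & c : cands) {
        c.p = std::exp(c.logit - max_logit);
        sum += c.p;
    }
    for (auto & c : cands) {
        c.p /= sum;
    }
}

// Sketch of the idea: at temp == 0, populate probabilities first, then pick greedily.
// The chosen token is the same as with pick_greedy alone; only `p` differs.
int sample_temp_zero_with_probs(std::vector<Candidate> & cands) {
    apply_softmax(cands);
    return pick_greedy(cands);
}
```

With this sketch, calling `pick_greedy` alone leaves every `p` at whatever it was initialized to, while `sample_temp_zero_with_probs` returns the same token and also leaves a valid distribution in `p` for the caller to inspect.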

mscheong01, Feb 22 '24 05:02

> If so, this seems like a viable solution.

Yes, this should work

ggerganov, Feb 22 '24 09:02