Stop keywords
Implements https://github.com/ggerganov/llama.cpp/issues/57.
Stop keywords can be specified using the "--stop" parameter. As soon as one of these keywords appears in the generated output, generation terminates. As with reverse prompts, multiple stop keywords can be given by passing the --stop argument multiple times.
The implementation is heavily based on the reverse prompt implementation to keep things simple. Tested using 7B (quantized) in both interactive and non-interactive modes.
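For anyone skimming, the idea can be pictured roughly as below. This is only a minimal sketch and not the code in this PR: it assumes the check runs on the accumulated output as a plain string (the actual implementation may well compare tokens instead), and the helper names `ends_with` and `should_stop` are hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not taken from the PR): true if `output` ends with `keyword`.
// An empty keyword never matches, so it cannot stop generation by accident.
static bool ends_with(const std::string & output, const std::string & keyword) {
    return !keyword.empty() &&
           output.size() >= keyword.size() &&
           output.compare(output.size() - keyword.size(), keyword.size(), keyword) == 0;
}

// Hypothetical helper: true if the text generated so far ends with any stop keyword.
// Assumption: this would be called after each newly generated piece of text is appended.
static bool should_stop(const std::string & output,
                        const std::vector<std::string> & stop_keywords) {
    for (const std::string & keyword : stop_keywords) {
        if (ends_with(output, keyword)) {
            return true;
        }
    }
    return false;
}
```

In a generation loop, one would presumably append each new piece of text to a buffer and break out as soon as `should_stop(buffer, stop_keywords)` returns true, with `stop_keywords` holding one entry per --stop argument.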
Great feature!
Somewhat related to this, but for input: https://github.com/ggerganov/llama.cpp/issues/71#issuecomment-1478510617
Perfect!
Why are multiple keywords needed? Isn't just one enough (for example [end of text])?
Sometimes we want to end before Llama decides to generate [end of text], which is the reason this feature was requested. But sometimes we might also want multiple stop conditions. An example, off the top of my head, might be to stop at the text "YOU WIN" or "GAME OVER" (obviously fairly contrived, but it makes the point that there could be multiple ways in which a generation could terminate).
Also, it's not too hard to implement since the reverse prompt logic already does the heavy lifting. It doesn't really hurt anything to allow multiple keywords, so it seemed like a worthwhile investment.
Unrelated; I've just realized that a lot of conflicts have popped up in this PR. I'll try to correct those over the next few days.
Closing in favor of https://github.com/ggerganov/llama.cpp/pull/769