rabidcopy

Results: 91 comments by rabidcopy

> It doesn't compile as is because you have made the constant a local in `llama_model_load()`, and this means that it is not visible in `main()` where it is used....

Hold on, something isn't behaving right. Forgive me while I investigate. Edit: I understand what's going on. I was debugging end of text with `fprintf(stderr,...

> That's weird. I merged [330b86e](https://github.com/ggerganov/llama.cpp/pull/333/commits/330b86eed2d4e7e8588f62f5f1aba476e7ac406b) and I didn't have the newline issue. Maybe it has to do with [other tweaks I have](https://github.com/tjohnman/llama.cpp/blob/experimental/main.cpp)? I'm not sure, but it seems to...

> Here is a suggestion: notice that the token is generated in main.cpp:~1003, in this line:
>
> ```c++
> id = llama_sample_top_p_top_k(vocab, logits.data() + (logits.size() - n_vocab), last_n_tokens, repeat_penalty,...
> ```

> > Correct me if I'm wrong, but wouldn't this be getting very close to what the --ignore-eos argument does?
>
> Not entirely, --ignore-eos prevents eos from being sampled...
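To make the distinction concrete, here is a minimal, hypothetical sketch of a sampler where a token can be excluded before the draw. The function name, the greedy strategy, and the `banned_token` parameter are all illustrative assumptions, not llama.cpp's actual implementation; the point is only that exclusion happens *before* a token is ever chosen, which is how `--ignore-eos` is described:

```cpp
#include <cassert>
#include <vector>

// Greedy sampling with an optional banned token: the banned token can
// never be returned, because it is skipped before the selection is made.
// (Hypothetical sketch; llama.cpp's real sampler is top-p/top-k.)
int sample_greedy(const std::vector<float>& logits, int banned_token = -1) {
    int best = -1;
    float best_logit = -1e30f;
    for (int i = 0; i < (int)logits.size(); ++i) {
        if (i == banned_token) continue;  // excluded before sampling
        if (logits[i] > best_logit) {
            best_logit = logits[i];
            best = i;
        }
    }
    return best;
}
```

Checking the sampled token *after* the fact (as the `embd.back()` check does) is the other approach under discussion: there the token is still produced, and the code reacts to it.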

Ohh, this is coming before the `if (embd.back() == EOS_TOKEN_ID) {` line. I didn't realize your approach bypasses that entirely. Back to more testing then.
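For context, the post-sampling check being bypassed can be sketched like this. `EOS_TOKEN_ID` and the `embd` vector come from the thread; the function name and the `interactive` flag as a parameter are assumptions made for a self-contained example:

```cpp
#include <cassert>
#include <vector>

// Hypothetical constant mirroring the EOS token id discussed in the thread.
constexpr int EOS_TOKEN_ID = 2;

// Returns true when generation should hand control back to the user:
// the last emitted token is end-of-stream while in interactive mode.
// Mirrors the `if (embd.back() == EOS_TOKEN_ID)` check in main.cpp.
bool should_return_control(const std::vector<int>& embd, bool interactive) {
    return interactive && !embd.empty() && embd.back() == EOS_TOKEN_ID;
}
```

If EOS is banned at sampling time instead, `embd.back()` can never equal `EOS_TOKEN_ID`, so this branch is never taken.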

Ouch, a new problem with that approach. When using a reverse prompt, it does not print the reverse prompt and simply prints an empty line with control given back to the...

Well, trying to get something together. I feel the changes I made are still better than the former, but I want the reverse prompt stuff to work well with these...

> How about not outputting `` at all? Just set logit of that token to -Infinity before sampling. Model assigns probability to all tokens before sampling trims them, so suppressing...

Hmm. I kinda understand that. The EOS token just seems to be very problematic for interactive mode currently. I want a good solution and while the changes here are better...