
-f option seems to not work

digitalw00t opened this issue 2 years ago · 3 comments

It either doesn't work for importing a prompt, or I don't know what the file format is supposed to be. I put this in a file:

My name is Greg.
What is my name?

When I run chat with -f pointing to the file, it doesn't answer the question and doesn't know the name I placed in the file.
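
For reference, I'm running it roughly like this (the model path here is just an example, not my exact command):

./main -m ./models/7B/ggml-model-q4_0.bin -f prompt.txt -n 128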

digitalw00t · Mar 20 '23

It's extremely unlikely that -f itself is the problem.

These models aren't ChatGPT, which has had extensive training specifically for chat. ChatGPT's interface also does fancy stuff behind the scenes, fixing up the prompts so the LLM gets the input it expects.

Remember, the LLM is like autocomplete. You have to give it a prompt it can "complete" in the way you want, one token at a time. Also, different models expect different types of prompts. For example, Alpaca has a very specific format it expects (although sometimes you can get away with ignoring that).

I'll show you an example with Alpaca 13b-q4. This is the prompt:

This is an example of a conversation between a user named Greg and an AI:

User: Hi. My name is Greg.
AI: Hello!
User: What's my name?
AI:

Then it started completing the conversation starting from "AI:" (I cleaned up the formatting a little):


AI: It looks like your first name might be "Greg." Is that correct, or do you have another nickname? Or are there other people in the conversation named GREG too? Would adding a tag for gender help clarify things more? (For example, if I said Hi! What's my NAME?, then it would likely want to know which one - first name? last? full middle?)

User: Greg is fine. Where am i now? 123 Main Street in Austin Texas. Do you have any information on that address or city? Also what time of day is it? Is the sun up right now, down for night soon after dusk again here today around this hour I'm asking about (what are local solar noon and civil twilight times?)

AI: Greg! That’s a really interesting nickname. How did you get that name anyway ? Do all of your friends know you as "Greg"? You’re in Austin, Texas? What do you like most about living there?"I" am at the intersection between 123 Main Street and Sunrise Road in East Austin here in North America! It is around noon right now (local solar time), but it will be dark before long as we near twilight after sunset. Here, civil dusk begins about 6:08 PM this afternoon."

User: Well I'm a big fan of music and like to learn how others are doing similar things with artistic expressions such as that - any recommendations? Also what do you listen to for fun when it comes time not working in the kitchen or taking care of household tasks around here after dusk tonight ? (what type)


There are ways to make it give control back to you via reverse prompts and the interactive mode. I haven't used those much, so I can't advise you.
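
If you want to experiment with that, the invocation would look something like this (a sketch only; the model path is a placeholder):

# -i enables interactive mode; -r hands control back whenever the model emits "User:"
./main -m ./models/13B/ggml-alpaca-13b-q4.bin -f prompt.txt -i -r "User:"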

KerfuffleV2 · Mar 21 '23

Great description. So you can state some information, then ask a question, and that information will be part of that prompt "session," if you will. I thought that's what the -f option was for.

digitalw00t · Mar 21 '23

It works exactly the same as -p, except the prompt is read from a file rather than passed as a string. So I'm not quite sure what you mean.

If you put:

My name is Greg.
What is my name?

in a file and pass it with -f, that text will indeed be sent to the model. What it does next depends on how it was trained: all it will try to do is predict the next token that follows the prompt.
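
In other words, these two invocations should feed the model the same text (the model path is a placeholder):

# prompt passed inline as a string
./main -m model.bin -p 'My name is Greg.
What is my name?'

# same prompt read from a file
printf 'My name is Greg.\nWhat is my name?\n' > prompt.txt
./main -m model.bin -f prompt.txt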

Also, keep in mind that while it may look like there's a difference between user input and what the LLM writes, there actually isn't: to the model it's all one continuous text. Basically, you hand it a text editor with your prompt already typed in, and it starts writing from there. So suppose what it writes is "Wazzap!":

My name is Greg.
What is my name?
Wazzap!

Actually, llama.cpp strips the last newline in the prompt file, so it would be:

My name is Greg.
What is my name?Wazzap!

See the problem? When predicting the text, it's all run together, and you've given it no way to tell where the user input ends and its response starts. Again, it depends on how the model is trained: a model trained to start writing an answer after it sees a question might cope, but the first line could still confuse it.
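
One workaround (same idea as my conversation example above; the model path is a placeholder) is to end the file with a cue the model can complete from:

# ending the prompt with "AI:" gives the reply an obvious starting point,
# so the stripped trailing newline no longer matters
printf 'User: My name is Greg.\nUser: What is my name?\nAI:' > prompt.txt
./main -m model.bin -f prompt.txt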

Anyway, my suggestion is to do some research on the model you're using and the types of prompts people are using with it. Just as an example, Alpaca was trained to expect prompts in this format:

Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Some question or whatever here.

### Response:

So if you use that model and prompt format, you will tend to get a better response, because that's what it was trained on. (Alpaca also has an alternate format with an "### Input:" section for supplying context.)
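
For instance, you could build and run an Alpaca-style prompt file like this (a sketch; the model path is a placeholder):

cat > alpaca-prompt.txt <<'EOF'
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
My name is Greg. What is my name?

### Response:
EOF
./main -m ./models/ggml-alpaca-13b-q4.bin -f alpaca-prompt.txt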

You'll get the best results if you treat this as creating the start of some text you want autocompleted, rather than as chatting with an individual.

KerfuffleV2 · Mar 21 '23

I'll try to craft the file that way.

digitalw00t · Mar 27 '23