ggllm.cpp --help , pipes and inconsistent help text

I have always been irritated (also in Llama.cpp) by the fact that the help text cannot be piped. Neither a falcon-main --help |less (the help is meanwhile 85 lines long) will work nor a falcon-main --help |grep -A3 penal (which should e.g. provide a compilation in a few lines with everything that has to do with penalties commands). That is because the help output is written to stderr. But a help text output with --help is not an error message, but the desired text output of the help command.

These changes can be done quickly, just a lot of lines - but I also noticed a lot of inconsistencies in the help text. I would like to change them together with the correction of the output code.

The help text has a basic structure: Entries start with the argument or option preceded by - or --, followed by an argument in capital letters if one is provided, e.g. in the line

-p PROMPT, --prompt PROMPT

The argument has now been omitted in many new entries. I will add these again. But some lines become very long because of this, take for example:

-a,--alias,--finetune Set model name alias and optionally force fine-tune type (or disable it)

should actually be

-a ALIAS, --alias ALIAS, --finetune ALIAS Set model name alias and optionally force fine-tune type (or disable it)

It would be consistent to write it this way (and I'm all for writing the help text consequently consistent), but it doesn't make it any clearer. So my suggestion is to change the notation and basically write the argument only once after the comma-separated list of option notations. Then it would be

-s, --seed SEED               RNG seed (default: -1, use random seed for < 0)
-p, --prompt PROMPT           prompt to start generation with (default: empty)
-a,--alias,--finetune ALIAS   Set model name alias and optionally force fine-tune type (or disable it)

As a benefit there would be some more lines where the description still fits in the same line as the parameter list. If I get approval for this, I would send the mentioned changes all together into a PR.

Jul 19 '23 15:07 maddes8cht

I agree, especially confusing for less experienced people. Some of the original names are also not well chosen, things are often presented without a real explanation. I think it would make sense to split the help into sections. A general minimal help covering the top ~10 options And -h sampling/interact/gpu/cpu/server/extended (or similar) groups that present the options fitting.

Jul 19 '23 18:07 cmp-nct

Working on it. I think i have good sections.

Questions on the descriptions: The --reverse-prompt section has been removed from the help text, but can be found among the validated parameters and works. Is there a reason for this? Should the --stopwords be the replacement for this? But then the functionality of --reverse-prompt should also be removed, right?

As long as --reverse-prompt works, I'll add the description back in for now, adding a "deprecated" to it? Or should we have a more distinctive description here?

I'm still confused about the --keep flag, as i mentioned in #46. Always when --keep N is mentioned, it is stated

number of tokens to keep from the initial prompt (default: 0, -1 = all)

To me, this sounds exactly like what I would expect from a "system" message: The initial prompt, the very first prompt of the conversation, should always remain in context. This is how I understand this line. In my attempts I always had the impression that this does not work and that the initial prompt always disappears from the context. For clarification: Is it rather the case that --keep merely controls that in a concrete answer the user's previously requested prompt remains in the context (to the selected portion)? (And in contrast to the ìnitial`, i.e. the very first prompt?) While older parts of the context are nevertheless pushed out? Such a setting makes sense, too, of course - also and especially together with a system prompt that does not disappear from the context even in a longer conversation.

Jul 20 '23 09:07 maddes8cht

I'm not sure if the "anti prompt" (which is a really bad name) still has a purpose. Same with reverse prompt. The code has not yet been removed but I believe both are obsolete. optional prompt post/prefixing, automated fine tune syntax and stop words took those jobs. I removed them because the help is already confusing enough, additional redundant options just mislead people. I'd abstract the help into an own function that displays help and has an argument to display help for a specific section.

keep: The system prompt uses "keep" under the hood. Most real life use of "keep" is likely covered by using -sys instead. Keep only plays a role during context switch, once you process more than n_ctx tokens, keep will be checked. Without keep any tokens (starting at prompt) will be lost from the start on. With keep the earliest "n" tokens are kept.

Jul 20 '23 23:07 cmp-nct

I'm pushing my PR with the help text now. The help text got significantly longer, but also more meaningful and yet clearer and more structured (at least I think so).

The output goes to stdout, which means that the help can finally be used in pipes - and I think this is not an uncommon use of pipes, of less and grep. With a good grep, a --help CATEGORY command becomes almost obsolete for a user, you then quickly find what you are looking for anyway.

Nevertheless, the thought of a help function with arguments, with which one can indicate sections, I also had. The problem is that the current structure of gpt_params_parse does not allow for parameters with optional arguments. Even my proposed validateParams function can't do that yet. Therefore I have not implemented this help with optional section argument here yet, because a --help that works ONLY with sections is really bad, as you will not get any help with --help without arguments.

Next i will implement a more proper validation code that should: validate the params, allows validating for numbers, non-negative numbers, strings (right now, most of the params that expect a number as argument will make the program die without any message if you provide a letter or string as argument) and gives according message if an argument doesn't fit Allow checking of optional arguments, which makes a --help CATEGORY command possible.

Give me a line if you appreciate that, then I'll do it.

Jul 21 '23 18:07 maddes8cht

Just merged it, help looks much more clean now. I don't think grep is a real replacement to sections as people often lack that skill and it's not available on windows by default. But my guess is that the usage of ggllm will move from console into server-based in the mid future. So console usage will be for advanced usage (or to interface with the server) anyway. Probably good to wait with making help more complex for now and see how things go.

Validation sounds like a good idea, also through an encapsulated function so it's easy to maintain. Validation should always be optional so adding a new parameter can be done quickly.

Jul 21 '23 23:07 cmp-nct

Ah - how will you start your server?

The server will still remain a cli application which will have the same help as this one. And if you intend it to be all in the same program (which i like - in Llama we already seem to have the situation that people implement features in main, but forget it in server or vice versa), the help will need another section, making it even longer.

So, a good help and a good validation that actually gives messages about things that went wrong on the given command line is even more needed.

Jul 22 '23 07:07 maddes8cht

Yep, llama is still mostly handled like a ggml example project. I wish to move away from that. so one core binary. Validation is useful, sure.

Jul 22 '23 20:07 cmp-nct