Feature suggestion: For a given set of assistant answers to a prompt, let users generate a synthesized compound answer

Open michaelbogdan opened this issue 2 years ago • 0 comments

This might be useful for two things:

1. Sometimes after answering a prompt myself, I see several other answers touching on points I missed in my own answer. Or I see an answer whose style is better suited to the prompt but which is factually wrong. In either case, a compound answer that synthesizes the best parts of all assistant answers might provide even better examples of good answers than any one human could.
2. The answers before compounding, paired with the resulting compound answers, might be used to train a model that samples the possible answers an LLM might give to a prompt and then synthesizes a compound answer. That compound answer could then be used in training either the reward model or the LLM itself.
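
To make the second point a bit more concrete, here is a rough sketch of how the collected data could be turned into training examples. Everything in it is an assumption for illustration: the `CompoundExample` record, the helper names, and the prompt layout are hypothetical and not part of the existing codebase. Treating the compound answer as preferred over every individual answer is also just one possible way to feed a reward model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CompoundExample:
    """Hypothetical record: one prompt, its individual answers, and the
    user-written compound answer synthesized from them."""
    prompt: str
    candidates: List[str]
    compound: str


def to_synthesis_example(ex: CompoundExample) -> Tuple[str, str]:
    """Build a (source, target) pair for a model that learns to synthesize
    a compound answer from several candidate answers."""
    numbered = "\n".join(
        f"[{i}] {answer}" for i, answer in enumerate(ex.candidates, start=1)
    )
    source = f"Prompt: {ex.prompt}\nAnswers:\n{numbered}"
    return source, ex.compound


def to_preference_pairs(ex: CompoundExample) -> List[Tuple[str, str, str]]:
    """Build (prompt, preferred, rejected) triples for reward-model training.
    Assumption: the compound answer is preferred over each individual answer."""
    return [
        (ex.prompt, ex.compound, candidate)
        for candidate in ex.candidates
        if candidate != ex.compound
    ]
```

For a prompt with two individual answers, `to_synthesis_example` yields one source/target pair and `to_preference_pairs` yields two triples, so a single compound answer could serve both training uses mentioned above.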

Feb 18 '23 18:02 michaelbogdan