llama.cpp

Jeopardy Example Script

Open CRD716 opened this issue 3 years ago • 6 comments

Closes #1163

This is pretty much a straight port of aigoopy/llm-jeopardy. I'm leaving it as a draft since it's still missing a lot of features, and I will continue to work on it to make it more usable.

CRD716, Apr 25 '23 03:04

All that's left is the readme.

CRD716, Apr 27 '23 05:04

I don't think this is correct:

1,The Oscars,Who is John Williams?,Which actor Born in 1932 was the son of a percussionist in the CBS radio orchestra has been nominated for 53 Oscars?

The question should be in the answer format, like so:

1,The Oscars,Who is John Williams?,Born in 1932 & the son of a percussionist in the CBS radio orchestra, he's been nominated for 53 Oscars
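One practical side effect of keeping the original clue wording: the clue itself contains a comma, so a naive split on every comma would yield five fields instead of four. Below is a minimal Python sketch of one way to handle that. It assumes the four-column layout shown above (number, category, expected response, clue), that the clue is always the last column, and that the first three columns never contain commas. `parse_row` is a hypothetical helper for illustration, not something taken from the script.

```python
def parse_row(line: str) -> dict:
    """Split a row into 4 fields, keeping commas inside the trailing clue.

    Assumption: only the last column (the clue) may contain commas.
    """
    # maxsplit=3 stops after the third comma, so the clue stays in one piece.
    number, category, response, clue = line.rstrip("\n").split(",", 3)
    return {
        "number": int(number),
        "category": category,
        "response": response,
        "clue": clue,
    }


row = parse_row(
    "1,The Oscars,Who is John Williams?,Born in 1932 & the son of a "
    "percussionist in the CBS radio orchestra, he's been nominated for 53 Oscars"
)
print(row["clue"])
```

If other columns can contain commas too, quoting the fields and reading them with Python's `csv` module would be the safer choice.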

SlyEcho, Apr 27 '23 07:04

> I don't think this is correct:
>
> 1,The Oscars,Who is John Williams?,Which actor Born in 1932 was the son of a percussionist in the CBS radio orchestra has been nominated for 53 Oscars?
>
> The question should be in the answer format, like so:
>
> 1,The Oscars,Who is John Williams?,Born in 1932 & the son of a percussionist in the CBS radio orchestra, he's been nominated for 53 Oscars

I haven't changed the prompt question from the original repository. The point is to see if it can bring up facts, not if it can play Jeopardy as it is on the show.

CRD716, Apr 27 '23 14:04

> I haven't changed the prompt question from the original repository. The point is to see if it can bring up facts, not if it can play Jeopardy as it is on the show.

Then the answer should be "John Williams" not "Who is John Williams?"
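To make the mismatch concrete, here is a rough Python sketch of how a scorer could reduce a Jeopardy-style response to the bare fact before comparing. The opener list in the regular expression and both function names are assumptions made for illustration; none of this is in the script or in the original repository.

```python
import re

# Hypothetical list of interrogative openers used by Jeopardy-style responses.
_PREFIX = re.compile(
    r"^(who|what|where|when)\s+(is|are|was|were)\s+", re.IGNORECASE
)


def strip_question_form(response: str) -> str:
    """Turn 'Who is John Williams?' into 'John Williams'."""
    # Drop surrounding whitespace and a trailing question mark first.
    text = response.strip().rstrip("?").strip()
    # Remove at most one leading "Who is"-style opener, if present.
    return _PREFIX.sub("", text, count=1)


def same_answer(expected: str, got: str) -> bool:
    """Compare two responses case-insensitively, ignoring the question form."""
    return (
        strip_question_form(expected).casefold()
        == strip_question_form(got).casefold()
    )


print(strip_question_form("Who is John Williams?"))
```

Contractions such as "Who's" and leading articles are not handled here, so this is only a starting point.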

SlyEcho, Apr 27 '23 14:04

> > I haven't changed the prompt question from the original repository. The point is to see if it can bring up facts, not if it can play Jeopardy as it is on the show.
>
> Then the answer should be "John Williams" not "Who is John Williams?"

See https://github.com/aigoopy/llm-jeopardy/blob/main/qasheet.ods and #1163. I would assume we're trying to use the same data as everyone else, so I'm not sure whether this issue is supposed to be an implementation of aigoopy's jeopardy or just something in a similar style. @ggerganov, which would you prefer?

CRD716, Apr 27 '23 14:04

The columns "Original Answer" and "Original Correct Question" in the spreadsheet are the data they used (what is the source? maybe https://j-archive.com/). They then created "Model Prompt", where the clue has been turned into a question, and all the models also respond in an answer format, as explained on Reddit.

But anyway, I think this test should be either question-answer or jeopardy style answer-question, but not a mix.

If we don't change the data from the original, we could possibly evaluate a much larger dataset without having to manually edit questions.
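A rough Python sketch of what "not a mix" could look like: one function that returns the prompt and the expected response in the same style. The dictionary keys mirror the spreadsheet column names mentioned above, but the exact keys, the prompt wording, and `build_case` itself are hypothetical. The "qa" branch also assumes the expected response has the form "Who is X?" or "What is X?".

```python
def build_case(row: dict, style: str = "jeopardy") -> tuple[str, str]:
    """Return (prompt, expected) with both sides in the same style."""
    expected_question = row["original_correct_question"]
    if style == "jeopardy":
        # Uses the unedited clue, so no manual rewriting is needed.
        prompt = (
            f"Category: {row['category']}\n"
            f"Clue: {row['original_answer']}\n"
            "Respond in the form of a question."
        )
        return prompt, expected_question
    if style == "qa":
        # Needs the hand-written question from the "Model Prompt" column.
        # Drop the "?" and the two-word opener to get the bare fact.
        bare = expected_question.rstrip("?").split(" ", 2)[-1]
        return row["model_prompt"], bare
    raise ValueError(f"unknown style: {style!r}")


example = {
    "category": "The Oscars",
    "original_answer": (
        "Born in 1932 & the son of a percussionist in the CBS radio "
        "orchestra, he's been nominated for 53 Oscars"
    ),
    "original_correct_question": "Who is John Williams?",
    "model_prompt": (
        "Which actor Born in 1932 was the son of a percussionist in the "
        "CBS radio orchestra has been nominated for 53 Oscars?"
    ),
}
prompt, expected = build_case(example, style="jeopardy")
print(prompt)
print(expected)
```

Only the "jeopardy" branch works on rows that have not been edited by hand, which is what would make a larger dataset usable as-is.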

SlyEcho, Apr 27 '23 14:04