LLM Eval - Implement Faithfulness/Factual Accuracy metric

Open shadeMe opened this issue 1 year ago • 0 comments

Depends on https://github.com/deepset-ai/haystack/issues/7022.

Wrap `LLMEvaluator` to provide a component that calculates the "Faithfulness" or "Factual accuracy" based on the following inputs:

- Questions
- Contexts
- Responses

This component is meant to be plug-and-play: it should ship with a default prompt and examples that work reasonably well out of the box. Users should also be able to customize both.
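
One way to picture "defaults that can be overridden" is the minimal sketch below. It does not use Haystack; `FaithfulnessPromptConfig`, `DEFAULT_INSTRUCTIONS`, `DEFAULT_EXAMPLES`, and `build_prompt` are hypothetical names, and the prompt wording is only illustrative, not a proposed default.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical defaults: illustrative wording, not the prompt this issue will ship.
DEFAULT_INSTRUCTIONS = (
    "Given a question, a context, and a response, answer 1 if every statement "
    "in the response is supported by the context, otherwise answer 0."
)
DEFAULT_EXAMPLES: List[Dict[str, str]] = [
    {
        "question": "Where is the Eiffel Tower?",
        "context": "The Eiffel Tower is in Paris.",
        "response": "It is in Paris.",
        "score": "1",
    },
]


@dataclass
class FaithfulnessPromptConfig:
    """Holds the prompt pieces; both fields fall back to the defaults above."""

    instructions: str = DEFAULT_INSTRUCTIONS
    examples: List[Dict[str, str]] = field(
        default_factory=lambda: [dict(e) for e in DEFAULT_EXAMPLES]
    )

    def build_prompt(self, question: str, context: str, response: str) -> str:
        """Render instructions, few-shot examples, and the new input as text."""
        lines = [self.instructions, ""]
        for ex in self.examples:
            lines += [
                f"Question: {ex['question']}",
                f"Context: {ex['context']}",
                f"Response: {ex['response']}",
                f"Score: {ex['score']}",
                "",
            ]
        lines += [
            f"Question: {question}",
            f"Context: {context}",
            f"Response: {response}",
            "Score:",
        ]
        return "\n".join(lines)
```

A user who is happy with the defaults would call `FaithfulnessPromptConfig()`; one who is not would pass their own `instructions` or `examples`.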

The LLM must return a binary value for each (question, context, response) input tuple. This lets us calculate the final score for the dataset ourselves.
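
As a rough sketch of that aggregation step, assuming the binary values are encoded as 0 and 1 and that the dataset score is their mean (the issue does not fix either choice), the helper could look like this. `aggregate_binary_scores` is a hypothetical name, not part of Haystack.

```python
from typing import Sequence


def aggregate_binary_scores(verdicts: Sequence[int]) -> float:
    """Return the fraction of inputs the LLM judged faithful.

    Assumes one verdict per (question, context, response) tuple,
    encoded as 0 (not faithful) or 1 (faithful).
    """
    if not verdicts:
        raise ValueError("At least one verdict is required.")
    for v in verdicts:
        # Reject anything that is not a binary verdict so that a malformed
        # LLM reply cannot silently skew the dataset score.
        if v not in (0, 1):
            raise ValueError(f"Expected a binary verdict (0 or 1), got {v!r}.")
    return sum(verdicts) / len(verdicts)


# Three of four responses judged faithful.
score = aggregate_binary_scores([1, 0, 1, 1])
print(score)  # 0.75
```

Validating the values before averaging is a design choice: it surfaces non-binary replies as errors rather than folding them into the score.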

Feb 16 '24 18:02 shadeMe