haystack icon indicating copy to clipboard operation
haystack copied to clipboard

Design and implement scaffolding for pipeline evaluation

Open shadeMe opened this issue 1 year ago • 0 comments

Implement scaffolding code that:

  • Accepts...
    • The evaluated pipeline, i.e., the pipeline whose output is to be evaluated.
    • A set of inputs for the above pipeline.
    • The evaluation pipeline, i.e., the one with the evaluation components/metrics.
    • A set of additional inputs for the evaluation pipeline, e.g: labels, etc.
  • Runs...
    • The evaluated pipeline with the above inputs.
      • Optionally allows overriding parameters of specific components in said pipeline.
    • The evaluation pipeline with the outputs of the above pipeline.
  • Returns...
    • The results of the evaluation pipeline.

The scaffold will further allow specialization for individual use cases. For instance, an end-to-end RAG pipeline evaluation harness can be built on top of it, implementing a RAG-specific API.

Related to #7415.

shadeMe avatar Apr 10 '24 13:04 shadeMe